ABSTRACT


Drug  addiction  in  the  United  States  generates  significant  health,  economic,  and  social 
costs.  One  of  the  prominent  ways  in  which  traffickers  smuggle  drugs  into  the  United 
States  is  by  maritime  shipments  from  South  America.  In  1989  Joint  Interagency  Task 
Force  South  (JIATF-S)  was  established  to  fight  these  traffickers.  JIATF-S  collects 
information  from  multiple  sources,  which  can  be  broadly  classified  into  two  categories. 
The  first  category  is  sensor-based  sources  that  produce  observations  about  possible 
targets  (e.g.,  radar,  sonar).  These  observations  provide  precise  location  and  time  but  are 
susceptible  to  false  positive  and  false  negative  errors  regarding  their  content.  The  second 
category  is  human-based  sources,  including  tips,  messages  and  intercepted 
communications  among  humans.  In  addition  to  possible  misinformation  regarding  the 
content  of  an  event,  such  inputs  are  also  susceptible  to  errors  regarding  the  location  and 
time  of  the  event. 

In this thesis we develop a data fusion model that can assist JIATF-S in estimating
the likelihood that a certain target (i.e., drug-smuggling vessel) is present at a certain
location at a certain time and in evaluating the reliability of the information source.

The  novelty  of  this  thesis  is  manifested  in  a  new  probabilistic  approach  for 
utilizing human-generated intelligence, and in the way it is combined with
sensor-generated intelligence.

EXECUTIVE SUMMARY


Data  fusion  from  various  sources  is  a  common  problem  for  intelligence  organizations 
around  the  world.  In  this  thesis  we  explore  the  efforts  of  the  Joint  Interagency  Task  Force 
South,  an  organization  established  in  1989  to  fight  drug  traffickers  originating  from 
South  America,  to  combine  different  sources  of  intelligence  into  a  coherent  picture  to 
seize  the  smuggled  drugs. 

In this thesis we examine the combination of two categories of intelligence
sources regarding drug smugglers: (1) sensor-based sources such as SIGINT (signals
intelligence) and VISINT (visual intelligence), and (2) human-based sources such as
HUMINT (human intelligence) and COMINT (communications intelligence). Sensor-based
sources typically have high precision regarding location and time of an observation but
are susceptible to false positive and false negative errors. Human-based sources,
including tips, messages and communications generated by humans, are susceptible to
these same errors. In addition to possible misinformation regarding the description of a
reported event, these sources also tend to have low precision regarding the location and
time of the event.

We  explore  several  methods  for  combining  information  from  sensor-based  and 
human-based  sources.  In  addition  to  the  traditional  Bayesian  update  mechanism,  which  is 
commonly  used  for  sensor  fusion,  we  also  examine  applying  Dempster-Shafer  theory. 
The  Bayes’  method  is  mathematically  rigorous  but  requires  a  number  of  assumptions  not 
needed  for  the  Dempster-Shafer  methods,  namely  assuming  that  the  distribution  of  the 
messages  received  from  the  informant  is  known,  and  uniform.  The  Dempster-Shafer 
theory  does  not  make  those  assumptions  explicitly.  Moreover,  there  are  several  ways  to 
implement  the  Dempster-Shafer  theory,  and  it  is  not  clear  in  advance  which 
implementation  would  be  most  appropriate  for  a  given  scenario.  We  compare  the 
methods  both  qualitatively  and  quantitatively  using  a  simulation. 

Our  analysis  shows  that  even  when  the  assumptions  of  the  Bayes’  update  process 
are violated, it still manages to yield the best results. The Dempster-Shafer methods did
not perform better than Bayes even though they do not explicitly make as many
assumptions as the Bayes update. As expected, when the reliability of the informant is
low or is mistaken to be low, and there is non-uniformity in the way the informant
produces messages, all the methods performed poorly.

In  addition  we  develop  a  Bayesian  model  to  assess  the  quality  of  the  informant 
and  update  a  vessel’s  location  simultaneously.  We  formulate  update  procedures  both 
when  the  informants’  messages  can  be  verified  and  when  they  cannot  be  verified  and  we 
must  rely  only  on  the  current  perception  about  the  location  of  the  vessel  and  the 
informant’s  reliability  for  the  update.  We  suggest  a  combined  scheme  that  allows 
simultaneous  estimation  of  both  the  location  of  the  vessel  and  the  reliability  of  the 
informant  as  new  information  becomes  available. 

I. INTRODUCTION


A.  MOTIVATION 

Drug  addiction  in  the  United  States  generates  significant  health,  economic,  and 
social  costs.  According  to  the  National  Drug  Intelligence  Center  (NDIC)  report  from 
2011,  “In  2007  alone,  the  estimated  cost  of  illicit  drug  use  to  society  was  $193  billion, 
including  direct  and  indirect  public  costs  related  to  crime,  health,  and  productivity...  an 
increasing  number  of  individuals,  particularly  young  adults,  are  abusing  illicit  drugs.  In 
2009,  an  estimated  8.7  percent  of  Americans  aged  12  or  older  (21.8  million  individuals) 
were  current  illicit  drug  users”  (NDIC,  2011). 

One of the primary ways drugs arrive in the United States is via smuggled
maritime  shipments  from  South  America.  Most  of  the  marijuana  seized  and  more  than 
50%  of  the  methamphetamine,  cocaine  and  heroin  seized  are  detected  on  the  Southwest 
border  (NDIC,  2011).  Figure  1  illustrates  that  more  than  99%  of  the  cocaine  flow  from 
South  America  to  the  United  States  in  2007  was  smuggled  through  the  Caribbean  Sea  or 
Pacific  Ocean  via  Mexico  (Palter,  2009). 

Figure 1. Northward-bound Cocaine Flows (From Palter, 2009).

In October 1989 the Joint Interagency Task Force South (JIATF-S) was
established  in  order  to  fight  these  traffickers.  JIATF-S  is  a  multiservice,  multiagency 
national  task  force  that  conducts  counter-illicit  trafficking  operations,  intelligence  fusion, 
and  multi-sensor  correlation  to  detect,  monitor,  and  hand  off  suspected  illicit  trafficking 
targets  (JIATF-S,  2013).  JIATF-S  collects  information  from  multiple  sources  with 
different  characteristics  and  of  different  quality.  The  types  of  information  collected  can 
be  broadly  classified  into  two  categories: 

1)  Sensor-based intelligence: Sensor observations are typically characterized
by high precision regarding the location and time of the observation but are also
susceptible to false positive and false negative errors regarding their outcome. Typical
examples of sensor-based sources are electronic intelligence (ELINT), obtained from
sources such as radar and non-content signals from communication devices, and visual
intelligence (VISINT), obtained from sources such as video, images and the naked eye.

2)  Human-based intelligence: Tips, messages and communications generated
by humans, which in addition to errors in content are also susceptible to low precision
regarding the location and time of the event. The error rate depends upon the reliability of
the source, which is less well defined than the error rates of sensors. Typical examples
are human intelligence (HUMINT), gathered from human sources in the form of tips and
messages, and communications intelligence (COMINT), content-based intelligence
gathered from intercepted communications.

A  main  challenge  for  JIATF-S  is  the  integration  of  information  about  different 
spatial  locations  and  time  ranges  from  multiple  sources  in  a  consistent  and  coherent 
manner in order to locate illicit drug-trafficking vessels.

In  this  thesis  we  develop  data  fusion  techniques  to  assist  JIATF-S  in  estimating 
the  likelihood  that  a  certain  target  (i.e.,  a  drug-smuggling  vessel)  is  present  at  a  certain 
location  at  a  certain  time.  This  information  can  provide  JIATF-S  with  better  situational 
awareness  and  inform  decision  makers  about  where  they  choose  to  send  search  and 
interdiction  assets.  More  specifically,  we  provide  a  probability  distribution  for  the 
location and departure time of possible targets. We also evaluate the quality of the
information  sources  in  order  to  give  their  inputs  the  proper  weight  (e.g.,  the  reliability  of 
a  human  informant). 

B.  CONTRIBUTIONS  OF  THIS  WORK 

Exploitation  of  human  intelligence  for  targeting  has  been  known  since  ancient 
times  and  discussed  by  many  strategists  and  military  theorists,  as  described  for  example 
in the famous “The Art of War” written by Sun Tzu in the 6th century BC. Processing
HUMINT into spatial information, for instance to create crime maps for policing
applications, is also known (Ratcliffe, 2000).

There is also a vast literature on the fusion of sensor-based information (Hall &
Llinas, 2001); however, there is still a need to combine human intelligence with
intelligence from other sources - a process that is traditionally done manually by trained
professionals due to the difficulties of implementing automated algorithms.

In  this  work  we  suggest  automated  algorithms  to  combine  human-based 
intelligence  and  sensor-based  intelligence  in  a  single  framework.  We  also  provide  a  way 
to  estimate  and  update  the  perceived  reliability  of  our  sources. 

C.  THE  INTELLIGENCE  PROCESS 

Since  this  thesis  relates  to  the  effective  utilization  of  intelligence,  it  is  useful  to 
frame  this  thesis  within  the  intelligence  processing  paradigm.  The  intelligence  process 
comprises the following six categories of intelligence operations (United States Dept. of
the Army, 2007):

•  Planning  and  direction  -  Planning  operations  to  acquire  new  or  better  data  or 
develop  intelligence  sources. 

•  Collection  -  Acquisition  of  the  required  data. 

•  Processing  and  exploitation  -  Converting  the  collected  raw  data  into 
information  that  can  be  used  by  commanders. 

•  Analysis  and  production  -  Analyzing  the  information  and  producing  higher- 
level  intelligence  from  the  information  gathered  via  interpretation  and 
integration  with  other  relevant  information. 

•  Dissemination and integration - Disseminating the intelligence to appropriate
users. 

•  Evaluation  and  feedback  -  Evaluation  of  the  intelligence  performance. 

The  following  figure  represents  graphically  the  intelligence  process: 

Figure 2. The Intelligence Process (From United States Dept. of the Army, 2007).


Since  the  evaluation  and  feedback  component  is  used  to  initialize  the  other 
intelligence  activities,  the  intelligence  process  is  also  referred  to  as  the  “intelligence 
cycle.”  In  this  thesis,  we  concentrate  on  the  processing  stage  and  examine  how  to 
effectively  integrate  new  pieces  of  information  into  the  intelligence  profile  using  data 
fusion  methods. 

Combining  information  from  different  data  sources  is  commonly  called  data 
fusion,  which  is  defined  as  a  “process  dealing  with  the  association,  correlation,  and 
combination  of  data  and  information  from  single  and  multiple  sources  to  achieve  refined 
position  and  identity  estimates,  and  complete  and  timely  assessments  of  situations  and 
threats  as  well  as  their  significance”  (White,  1991). 

In this thesis, we concentrate on combining the two types of inputs - sensor data
and HUMINT input - into a single fused intelligence picture. The feedback between these
two  sources  can  be  used,  in  turn,  to  reevaluate  the  quality  of  the  intelligence  obtained 
from  each  source,  thus  improving  the  fusion  of  future  intelligence.  The  following  figure 
illustrates,  in  simple  terms,  the  processing  problem  considered  in  this  thesis. 

Figure 3. Simplified intelligence process model.


D.  DATA  FUSION 

Data  fusion  may  relate  to  fusion  of  information  of  different  levels  and  for 
different  purposes.  In  order  to  encompass  and  categorize  the  different  levels,  the  Data 
Fusion  Subpanel  (which  later  became  known  as  the  Data  Fusion  Group)  of  the  Joint 
Directors of Laboratories (JDL) developed the Data Fusion model in 1985 (Hall &
Llinas, 2001).

The  JDL  model  as  revised  in  (Steinberg,  Bowman  &  White,  1999)  categorizes  the 
fusion  according  to  the  relation  of  the  information  to  the  entity  of  interest  (in  our  case  the 
drug  smuggling  vessel)  and  the  purpose  of  the  outcome  of  the  fusion.  The  following 
levels  are  included  in  the  model: 

•  Level  0  —  Sub-Object  Data  Assessment:  prediction  of  entities  that  are  not 
recognized  as  an  object  yet,  such  as  pixels  and  radio  signals. 

•  Level  1  —  Object  Assessment:  estimation  and  prediction  of  entity  states  on  the 
basis  of  inferences  from  observations. 

•  Level  2  —  Situation  Assessment:  estimation  and  prediction  of  entity  states  on  the 
basis  of  inferred  relations  among  entities. 

•  Level 3 — Impact Assessment: estimation and prediction of effects on situations
of  planned  or  estimated/predicted  actions  by  the  participants. 

•  Level  4  —  Process  Refinement  (an  element  of  Resource  Management):  adaptive 
data  acquisition  and  processing  to  support  mission  objectives. 

The flow of information, from raw measurements to assessment of the entire
picture, with regard to the JDL model levels is described in the following figure:

Figure 4. JDL model information flow (From Steinberg, Bowman & White, 1999).


In this thesis, we explore levels 0 and 1 of the JDL model, combining raw data
from multiple sources in order to estimate a target’s location and departure
time. Higher data fusion levels, such as assessing the characteristics and intentions of
multiple targets, are not discussed in this work.

We  will  examine  two  information-fusion  approaches  in  this  thesis:  Bayesian 
update  and  Dempster-Shafer  Theory. 

a.  Bayesian Update


Bayes’  formula  was  first  introduced  in  the  18th  century  (Bayes,  1763),  and 
its  applications  are  found  in  many  fields  that  range  from  diagnosing  the  medical  situation 
of  patients  (Lincoln  &  Parker,  1967)  to  artificial  intelligence  (Korb  &  Nicholson,  2011). 

Many  tracking  and  location  algorithms  are  based  on  Bayesian  methods, 
such as the well-known Kalman filter. The manuscript “Bayesian Filtering: From Kalman
Filters to Particle Filters, and Beyond” (Chen, 2003) includes an exhaustive review of
filters, and Morelande, Kreucher & Kastella (2007) review other Bayesian tracking
algorithms.

Bayesian methods are also applied extensively to sensor fusion. The book
Handbook of Multisensor Data Fusion (Hall & Llinas, 2001) presents an overview of
data fusion methods that includes several chapters regarding Bayesian updates and treats
them as the basic method for data fusion.

In  this  work,  the  Bayesian  method  is  used  in  a  similar  manner  to  update 
the  probability  that  a  target  is  at  a  particular  location  at  a  certain  time. 

However,  Bayesian  methods  also  have  limitations.  In  particular,  they 
require  intimate  knowledge  of  sensor  capabilities,  such  as  estimates  of  the  error  rates,  a 
notion  of  the  distribution  of  the  possible  errors  of  the  sensor  and  assumptions  regarding 
the  state  of  the  world,  such  as  a  prior  distribution.  For  those  reasons,  we  also  consider 
other  information  updating  methods. 

b.  Dempster-Shafer  Belief  Method 

Dempster-Shafer  theory  (DST)  was  developed  by  Arthur  Dempster  and 
Glenn  Shafer  (Dempster,  1967;  Shafer,  1976).  This  theory,  which  is  in  some  sense  a 
generalization of probability theory, allows for assigning “belief” values (and not
probabilities)  to  events  and  sets  of  events,  thus  requiring  fewer  assumptions  and  axioms. 

Due  to  the  theory’s  ability  to  deal  with  complicated  types  of  variability  in 
belief,  Dempster-Shafer  theory  has  been  used  widely  for  decision  making  algorithms  and 
data fusion. An Introduction to Bayesian and Dempster-Shafer Data Fusion (Koks and
Challa, 2003) is a good introductory summary of Dempster-Shafer theory in comparison
with Bayesian methods.

Hall and Llinas (2001) also include several chapters about Dempster-Shafer
theory, while Sentz (2002) is an extensive report about different Dempster-Shafer
methods and their applications.

E.  OTHER  POSSIBLE  APPLICATIONS 

The framework suggested in this thesis for locating drug traffickers may be
applied to a range of related applications.

1)  Locating  friendly  forces:  A  similar  problem  to  the  one  described  above  on 
the  sea  can  occur  on  land  as  well,  when  data  from  sensors  such  as  radars  are  combined 
with  information  from  human  sources  in  order  to  locate  a  friendly  force  in  need  of 
assistance  on  the  battlefield. 

2)  Other  types  of  sensors:  Traditional  updating  mechanisms  require  an 
intimate  knowledge  of  the  technical  parameters  that  determine  the  performance  of  a 
sensor  and  the  environment  in  which  it  is  used.  When  this  knowledge  is  lacking,  more 
robust  methods,  such  as  the  ones  explored  in  this  thesis,  can  be  of  use.  Those  methods 
can  be  applied  not  only  to  SIGINT  and  HUMINT,  but  also  to  other  types  of  intelligence. 

F.  THESIS  OUTLINE 

This  chapter  includes  the  background  and  problem  description.  Chapter  II 
describes  the  problem,  the  basic  integration  methods  used,  the  assumptions,  and  the 
details  of  the  Bayesian  update  and  Dempster-Shafer  belief  theory  and  their  application  to 
the  problem.  In  Chapter  III  we  compare  those  two  models  and  develop  a  simulation  to 
gain  additional  insights.  Chapters  IV  and  V  include  a  detailed  mathematical  framework 
of  extensions  to  the  base  model  described  in  Chapter  II.  While  Chapter  IV  includes  a 
model  that  handles  multiple  routes,  Chapter  V  includes  a  framework  that  allows 
estimating  and  updating  the  reliability  of  the  sources. 

II. THE MODEL


In  this  chapter  we  describe  the  scenario,  the  theater  of  operations  and  the  goals  of 
the  JIATF-S  operator.  We  define  the  different  types  of  intelligence  received  and  describe 
a  model  that  updates  the  situational  awareness  regarding  this  scenario  by  combining 
intelligence from different sources.

A.  SCENARIO 

We  consider  a  drug-smuggling  situation  from  the  northern  part  of  South  America 
to  Central  America.  Drug  smugglers  may  leave  from  multiple  points  of  embarkation  in 
the  northern  part  of  South  America  towards  one  of  multiple  final  destinations  in  Central 
America  and  Mexico.  The  smugglers  use  three  kinds  of  vessels  for  their  operations: 

•  GO-FAST  small  boats  -  designed  to  reach  high  velocities  but  with  relatively 
low  capacities, 

•  Merchant  Vessels  -  high  capacity  but  slow  and  easy  to  detect,  and 

•  SPSS (self-propelled semi-submersible) vessels - partly submersible and
difficult to detect by radar, but slow.

The  types  of  vessels  have  very  different  characteristics,  and  therefore,  it  is  usually 
easy  to  distinguish  among  them.  There  are  also  several  categories  of  typical  routes  the 
smugglers  use: 

•  Close to the shore in the Pacific Ocean,

•  Close to the shore in the Caribbean Sea,

•  In the Pacific Ocean, via the Galapagos Islands,

•  Straight  routes  between  the  embarkation  point  and  the  destination,  and 

•  Piece-wise  linear  routes  between  the  embarkation  point  and  the  destination. 
(This  category  essentially  covers  all  possibilities). 

Figure  5  shows  examples  of  those  routes. 

Figure 5. Typical smuggling routes.

JIATF-S  operators  desire  enhanced  situational  awareness  about  the  location  of 
targets  in  order  to  more  effectively  direct  interdicting  assets  that  can  seize  the  smuggled 
drugs.  JIATF-S  operators  increase  situational  awareness  by  obtaining  information  from 
sensors  such  as  radars  and  cameras,  as  well  as  from  human  sources.  Often,  new 
intelligence  arrives  in  real  time.  In  those  cases,  the  operators  should  update  their 
perceived  probability  that  a  vessel  is  located  in  a  specific  place  according  to  the  new 
intelligence. 

JIATF-S operators use their situational awareness to send surveillance aircraft or surface
vessels to look for smugglers in the suspected location. If a smuggler is positively
identified,  the  surveillance  vehicle  holds  contact  with  the  smuggler’s  vessel  until  a 
maritime  force  boards  it  and  confiscates  the  drugs. 

The  operators  must  also  evaluate  the  quality  of  the  different  intelligence  sources 
in  order  to  weight  their  information  contributions  appropriately. 

B.  ASSUMPTIONS 

For  tractability,  we  initially  make  the  following  assumptions,  some  of  which  we 
relax  later.  These  assumptions  reduce  the  problem  to  estimating  a  single  parameter: 
departure time.

1.  Single  Target  Vessel 

There  is  at  most  one  target  vessel  in  the  theater  at  any  given  time.  Although  in 
reality  multiple  vessels  might  be  present  in  the  theater  at  the  same  time,  it  is  assumed  that 
JIATF-S’s  operators  are  able  to  associate  incoming  information  with  the  correct  vessel. 
In  other  words,  in  this  thesis  we  do  not  consider  data-association  problems  that  might  be 
a  subject  for  future  research. 

2.  Constant  and  Known  Speed 

The speed of the vessel is constant and known. This assumption is reasonable
because the speed depends mainly on the vessel’s type (which is usually known), and the
variance of the speed for each vessel type is rather small over the course of a voyage.
Future work may consider variable velocity due to weather conditions, refueling stops,
strategic considerations or other factors.

3.  Discrete  Departure  Time  Distribution 

The  set  of  departure  times  is  discrete  and  finite;  the  vessel  can  leave  the  harbor  at 
any  one  of  several  possible  time  slots.  As  the  discretization  can  be  as  fine  as  necessary, 
this  assumption  does  not  affect  the  results  of  the  model.  A  reasonable  size  of  a  time  slot 
would  be  one  to  three  hours. 

4.  A  Single  Known  Route 

To  start,  we  assume  there  is  only  one  possible  route  for  simplicity.  This 
assumption  is  relaxed  later  on  to  include  multiple  routes.  Since  the  route  and  the  speed  of 
the  vessel  are  known  and  fixed,  the  location  of  the  vessel  is  uniquely  defined  by  the  time 
of  departure. 

C.  DEFINITIONS 

A random variable, $T_d$, denotes the departure time of the vessel.

For simplicity we assume that $T_d$ is discrete, and its possible values are in the set
$T = \{t_1, t_2, \ldots, t_n\}$, so there are $n$ possible departure times.

The probability mass function of the departure times is $f(t) = P(T_d = t)$. We
assume a prior for this distribution, and our objective is to update this prior as new
information arrives. As more intelligence arrives, the posterior should narrow around a
handful of most likely departure times to aid in the routing of surveillance aircraft.

1.  Sensor-based  Intelligence  -  Observations 

An observation is a random event associated with a certain time, $t' \in T$. An
example  of  an  observation  might  be  the  event  that  the  operator  received  a  radar  reading 
regarding  a  certain  time  and  location.  Since  we  assume  a  single  route  and  fixed  velocity, 
an  observation  that  is  made  at  any  location  along  the  route  can  be  trivially  translated  to  an 
observation made about a perceived departure time. For instance, if the speed is 30 knots,
and we have an observation at distance 60 NM on the route at 4 p.m., it is equivalent to an
observed departure at 2 p.m. This allows us to locate the ship at any desired time after
departure.
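
The translation from an observation along the route to a perceived departure time can be illustrated with a minimal Python sketch. It assumes the single known route and the constant, known speed of Section B; the function names and the calendar date are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta


def perceived_departure(obs_time, distance_nm, speed_knots):
    """Back-project an observation made on the route to a departure time.

    distance_nm is the distance along the route from the embarkation point,
    and speed_knots is the constant, known speed of the vessel type.
    """
    if speed_knots <= 0:
        raise ValueError("speed must be positive")
    hours_underway = distance_nm / speed_knots
    return obs_time - timedelta(hours=hours_underway)


def distance_along_route(departure, query_time, speed_knots):
    """Distance from the embarkation point (NM) at query_time; zero before departure."""
    hours = (query_time - departure).total_seconds() / 3600.0
    return max(0.0, hours * speed_knots)


# The example from the text: 60 NM along the route at 4 p.m., at 30 knots.
# The date itself is an arbitrary placeholder.
observation_time = datetime(2013, 1, 1, 16, 0)
departure = perceived_departure(observation_time, distance_nm=60, speed_knots=30)
one_hour_later = distance_along_route(departure, datetime(2013, 1, 1, 17, 0), 30)
```

Here `departure` corresponds to 2 p.m. on the same day, and the same departure time can then be projected forward to any later query time.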

The  formal  definition  of  an  observation  is: 

$O_{t'+}$ - a positive observation (a vessel departed at time $t'$).

$O_{t'-}$ - a negative observation (no departure at time $t'$).

The observations may be subject to the following errors:

$P(O_{t'+} \mid T_d \neq t') = P_{f+}$: False positive error, the sensor reports a departure while
there is none.

$P(O_{t'-} \mid T_d = t') = P_{f-}$: False negative error, the sensor fails to detect a departure.

The  error  probabilities  depend  on  the  detection  and  classification  capabilities  of 
the  sensor  and  on  the  characteristics  of  the  environment.  In  particular,  the  probability  for 
false positive error depends on the number of non-target vessels and debris in the area of
the  point  of  embarkation. 

2.  Human-based  Intelligence  -  Messages 

While  an  observation  is  associated  with  a  specific  time  of  departure,  messages  are 
less  specific  and  may  include  a  range  of  possible  departure  times.  An  example  of  a 
typical message may be an informant relaying, “I’ve heard that a ship might embark
between 8 a.m. and noon” or perhaps getting a hint via telephone communication that
“The drug dealers will leave on one of the following mornings...” As mentioned above
in Section B, we first consider only the time ambiguity and assume a single known
embarkation point. This assumption is relaxed later.

Let $M$ denote the random event that a certain message is received. The sample
space of those events (the possible messages) is all the subsets of the departure times
$\{t_1, t_2, \ldots, t_n\} = T$ except for the empty set. Let $k$ denote the cardinality of the
message, that is, $|M| = k$. Thus, if the event $M$ occurred, the informant claims that the
departure time is one of the $k$ values in $M$.
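
To give a sense of the size of this sample space, the following Python sketch enumerates all possible messages for a small set of departure times. The slot labels and the function name are illustrative assumptions and are not part of the model definition.

```python
from itertools import combinations


def all_messages(times):
    """Return every non-empty subset of the departure-time set as a frozenset."""
    return [
        frozenset(subset)
        for k in range(1, len(times) + 1)
        for subset in combinations(times, k)
    ]


# Hypothetical set of four departure slots.
T = ["t1", "t2", "t3", "t4"]
messages = all_messages(T)

# Group the messages by their cardinality k = |M|.
by_size = {}
for m in messages:
    by_size.setdefault(len(m), []).append(m)
```

For n possible departure times there are 2^n - 1 possible messages, so explicit enumeration of this kind is practical only for small n.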


D.  BAYESIAN  UPDATE 

1.  The  Update  Process 

As before, $f(t)$ is the probability mass function of the true departure time, and
$P(T_d = t) = f(t)$ is the probability that the true departure time is $t$. According to Bayes’
formula, the updated probability given new information is:

$$P(T_d = t \mid \text{New information}) = \frac{P(\text{New information} \mid T_d = t) \cdot P(T_d = t)}{P(\text{New information})} \tag{2.1}$$



From Equation (2.1) it follows that in order to calculate the updated probability
distribution using Bayes’ method, one requires a prior probability $P_{prior}(T_d = t) = f_{prior}(t)$.
This probability reflects the prior information we have about the vessel’s departure time.
Absent any information, we use the uniform prior $f_{prior} \sim U[T]$ as the default.
However, if we have some general information about the likelihood of different departure
times (for instance, if we know that the likelihood of departing at low tide is much higher
than at high tide), we can integrate this knowledge by altering the prior.
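
As an illustration, the sketch below builds a uniform prior and a prior reweighted by relative likelihoods. The slot labels and the 3:1 weighting in favor of low-tide slots are hypothetical values chosen only for the example.

```python
def uniform_prior(times):
    """Default prior: every departure slot is equally likely."""
    n = len(times)
    if n == 0:
        raise ValueError("at least one departure time is required")
    return {t: 1.0 / n for t in times}


def weighted_prior(weights):
    """Prior proportional to non-negative relative weights, normalized to sum to one."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return {t: w / total for t, w in weights.items()}


# Hypothetical three-hour slots; the first and last are assumed to be low tide.
slots = ["00:00", "03:00", "06:00", "09:00"]
flat = uniform_prior(slots)
tide = weighted_prior({"00:00": 3.0, "03:00": 1.0, "06:00": 1.0, "09:00": 3.0})
```

Either dictionary can then serve as the prior in the update process.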


Equation (2.1) can be rewritten as a product of an update function and the prior
information regarding the distribution: $f_{posterior}(t) = f_{update}(t) \cdot f_{prior}(t)$, with the update
function being:

$$f_{update}(t) = \frac{P(\text{New information} \mid T_d = t)}{P(\text{New information})} \tag{2.2}$$



In  the  following  chapters  we  show  how  to  define  the  update  function  for  different 
intelligence  types. 


2.  Bayesian  Update  Following  an  Observation 

As  described  before,  there  are  two  types  of  observations,  positive  and  negative. 
We  shall  calculate  the  update  function  for  each  of  those  cases. 


Applying Equation (2.2) to the case where the new information is a positive
observation at time $t'$, and using the law of total probability for the denominator, the
update function is:

$$f_{update}(t) = \frac{P(O_{t'+} \mid T_d = t)}{\sum_{s \in T} P(O_{t'+} \mid T_d = s) \, f(s)} \tag{2.3}$$



By definition of false positive and false negative errors, the probability of
receiving a positive observation regarding time $t'$ is $1 - P_{f-}$ if the true departure time is
indeed $t'$ and $P_{f+}$ otherwise:

$$P(O_{t'+} \mid T_d = t) = \begin{cases} 1 - P_{f-} & t = t' \\ P_{f+} & t \neq t' \end{cases} \tag{2.4}$$
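
And so we can calculate the total update function, given that we know the values
of $P_{f+}$ and $P_{f-}$:

$$f_{update}(t) = \begin{cases} \dfrac{1 - P_{f-}}{(1 - P_{f-}) \, f(t') + P_{f+} \, (1 - f(t'))} & t = t' \\[2ex] \dfrac{P_{f+}}{(1 - P_{f-}) \, f(t') + P_{f+} \, (1 - f(t'))} & t \neq t' \end{cases} \tag{2.5}$$

Similarly, we can follow the entire calculation for the case of receiving a negative
observation, and we get the following update function:

$$f_{update}(t) = \begin{cases} \dfrac{P_{f-}}{P_{f-} \, f(t') + (1 - P_{f+}) \, (1 - f(t'))} & t = t' \\[2ex] \dfrac{1 - P_{f+}}{P_{f-} \, f(t') + (1 - P_{f+}) \, (1 - f(t'))} & t \neq t' \end{cases} \tag{2.6}$$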
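
Equations (2.4) through (2.6) can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: it assumes the prior is stored as a dictionary whose values sum to one, and the function name, argument names and numerical error rates are hypothetical.

```python
def observation_update(prior, t_obs, positive, p_fp, p_fn):
    """Posterior over departure times after one observation about slot t_obs.

    prior    -- dict mapping each departure time to its probability (sums to one)
    positive -- True for a positive observation, False for a negative one
    p_fp     -- false positive probability of the sensor
    p_fn     -- false negative probability of the sensor
    """
    if positive:
        like_at, like_else = 1.0 - p_fn, p_fp      # likelihoods of Equation (2.4)
    else:
        like_at, like_else = p_fn, 1.0 - p_fp      # negative-observation counterpart
    # Law of total probability: the denominator shared by both cases.
    evidence = like_at * prior[t_obs] + like_else * (1.0 - prior[t_obs])
    if evidence <= 0.0:
        raise ValueError("observation has zero probability under the prior")
    # Posterior = update function times prior.
    return {
        t: (like_at if t == t_obs else like_else) * p / evidence
        for t, p in prior.items()
    }


# Hypothetical example: four slots, uniform prior, assumed sensor error rates.
prior = {t: 0.25 for t in ("t1", "t2", "t3", "t4")}
after_positive = observation_update(prior, "t2", True, p_fp=0.1, p_fn=0.2)
after_negative = observation_update(prior, "t2", False, p_fp=0.1, p_fn=0.2)
```

A positive observation about a slot raises its probability relative to the prior, while a negative observation lowers it; in both cases the posterior remains normalized.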


3.  Bayesian  Update  Following  a  Message 

For a given informant and a message of size $k$ we define $q_k$ to be the probability that the message is correct: $q_k = P(T_d \in M \mid |M| = k)$. That is, we assume that the quality of the informants depends only on the size of the message. The exact mathematical definition of this parameter will be discussed in the following chapters.

We assume that $q_k$ is monotone non-decreasing in $k$ and $q_n = 1$, where the informant gives the entire possible set of departure times as the message. An additional assumption here is that $q_k$ does not depend on the content of the message but only on its size $k$.

The update of the probability distribution of the random variable $T_d$ following a new message $M$ is done in a similar way to the observation case:

$$f_{update}(t) = \frac{P(M \mid T_d = t)}{\sum_{s \in T} P(M \mid T_d = s)\, f(s)} \tag{2.7}$$

where $f(s) = f_{prior}(s)$ is the probability distribution before the update.


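Differentiating between the case when "$t$ is in the message" and "$t$ is not in the message" and using the law of total probability, we obtain that the update function is:

$$f_{update}(t) = \begin{cases} \dfrac{P(M \mid T_d = t, t \in M)\, q_k}{\sum_{s \in M} P(M \mid T_d = s, s \in M)\, q_k\, f(s) + \sum_{s \notin M} P(M \mid T_d = s, s \notin M)\,(1 - q_k)\, f(s)} & t \in M \\[2ex] \dfrac{P(M \mid T_d = t, t \notin M)\,(1 - q_k)}{\sum_{s \in M} P(M \mid T_d = s, s \in M)\, q_k\, f(s) + \sum_{s \notin M} P(M \mid T_d = s, s \notin M)\,(1 - q_k)\, f(s)} & t \notin M \end{cases} \tag{2.8}$$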


However, in order to calculate the value of $f_{update}(t)$ we must know the values of the expressions $P(M \mid T_d = t, t \in M)$ and $P(M \mid T_d = t, t \notin M)$. In simple words, we need to know the probability of receiving every possible message given every possible departure time. Since this information is practically impossible to acquire, some additional assumptions must be made.

We assume that the probability of receiving a message of length $k$ is $l_k$, and that all messages of a certain size that include the true departure time are equally likely. In other words, if for example the possible departure times are $T = \{t_1, t_2, t_3, t_4\}$ and the real departure time is $t_1$, then receiving $\{t_1, t_2\}$ is as likely as receiving $\{t_1, t_3\}$ or $\{t_1, t_4\}$. In general, since the number of messages of size $k$ that include a certain departure time $s$ is $\binom{n-1}{k-1}$, the conditional probability of receiving a certain message that includes the true departure time $t$ is:

$$P(M \mid T_d = t, t \in M, |M| = k) = l_k \cdot \frac{1}{\binom{n-1}{k-1}} \tag{2.9}$$

Similarly, there are $\binom{n-1}{k}$ possible messages of size $k$ that do not include a certain departure time $t$, and assuming they are all equally likely brings us to the following expression for the probability of receiving a certain message given that it does not include the true departure time $t$:

$$P(M \mid T_d = t, t \notin M, |M| = k) = l_k \cdot \frac{1}{\binom{n-1}{k}} \tag{2.10}$$


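Combining Equations (2.9) and (2.10) we have the probability of receiving the message $M$ of size $k$:

$$P(M \mid T_d = t, |M| = k) = \begin{cases} l_k\, q_k \cdot \dfrac{1}{\binom{n-1}{k-1}} & t \in M \\[2ex] l_k\,(1 - q_k) \cdot \dfrac{1}{\binom{n-1}{k}} & t \notin M \end{cases} \tag{2.11}$$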


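Thus, the denominator of the update function (2.7) becomes

$$P(M) = \sum_{s \in T} f(s)\, P(M \mid T_d = s, |M| = k) = \sum_{s \in M} f(s)\, l_k\, q_k \frac{1}{\binom{n-1}{k-1}} + \sum_{s \notin M} f(s)\, l_k\,(1 - q_k) \frac{1}{\binom{n-1}{k}} \tag{2.12}$$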


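Combining the numerator (2.11) and the denominator (2.12) of the update function, we have the following function:

$$f_{update}(t) = \begin{cases} \dfrac{l_k\, q_k \frac{1}{\binom{n-1}{k-1}}}{\sum_{s \in M} f(s)\, l_k\, q_k \frac{1}{\binom{n-1}{k-1}} + \sum_{s \notin M} f(s)\, l_k\,(1 - q_k) \frac{1}{\binom{n-1}{k}}} & t \in M \\[3ex] \dfrac{l_k\,(1 - q_k) \frac{1}{\binom{n-1}{k}}}{\sum_{s \in M} f(s)\, l_k\, q_k \frac{1}{\binom{n-1}{k-1}} + \sum_{s \notin M} f(s)\, l_k\,(1 - q_k) \frac{1}{\binom{n-1}{k}}} & t \notin M \end{cases} \tag{2.13}$$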


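Simplifying this expression (using $\binom{n-1}{k} = \binom{n-1}{k-1} \cdot \frac{n-k}{k}$ and cancelling the common factor $l_k$) yields the following update function:

$$f_{update}(t) = \begin{cases} \dfrac{\frac{q_k}{k}}{\sum_{s \in M} f(s) \frac{q_k}{k} + \sum_{s \notin M} f(s) \frac{1 - q_k}{n - k}} & t \in M \\[3ex] \dfrac{\frac{1 - q_k}{n - k}}{\sum_{s \in M} f(s) \frac{q_k}{k} + \sum_{s \notin M} f(s) \frac{1 - q_k}{n - k}} & t \notin M \end{cases} \tag{2.14}$$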
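A minimal Python sketch of the message update in Equation (2.14) is given here for illustration. The function name `message_update`, the index-based representation of departure times, and the value $q_k = 0.9$ are assumptions made for the example, not part of the thesis.

```python
def message_update(prior, message, q_k):
    """Posterior after a message, following the form of Equation (2.14).

    prior   : list of probabilities over the n departure times
    message : set of time indices named by the informant (size k)
    q_k     : assumed probability that a message of size k is correct
    """
    n, k = len(prior), len(message)
    if k == n:
        # The message names every time (q_n = 1), so it carries no information
        return list(prior)
    w_in = q_k / k               # weight for times inside the message
    w_out = (1 - q_k) / (n - k)  # weight for times outside the message
    weights = [w_in if t in message else w_out for t in range(n)]
    norm = sum(w * p for w, p in zip(weights, prior))
    return [w * p / norm for w, p in zip(weights, prior)]


# Illustrative numbers only: five times, uniform prior, message {t1, t2} with q_2 = 0.9
prior = [0.2] * 5
posterior = message_update(prior, {0, 1}, q_k=0.9)
print(posterior)
```

For a uniform prior the result reduces to $q_k / k$ for times in the message and $(1 - q_k)/(n - k)$ for the others.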


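As an example, if the prior is a uniform prior function, $f_{prior}(t) = \frac{1}{n}$, the updated distribution can be simplified to be:

$$f_{new}(t) = \begin{cases} \dfrac{\frac{1}{n} \cdot \frac{q_k}{k}}{\frac{k}{n} \cdot \frac{q_k}{k} + \frac{n-k}{n} \cdot \frac{1 - q_k}{n - k}} = \dfrac{q_k}{k} & t \in M \\[3ex] \dfrac{\frac{1}{n} \cdot \frac{1 - q_k}{n - k}}{\frac{k}{n} \cdot \frac{q_k}{k} + \frac{n-k}{n} \cdot \frac{1 - q_k}{n - k}} = \dfrac{1 - q_k}{n - k} & t \notin M \end{cases} \tag{2.15}$$

E. DEMPSTER-SHAFER BELIEF THEORY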
1.  Background 

As discussed in the previous section, using the Bayesian method requires us to make significant assumptions that may be difficult to justify. In order to avoid those assumptions and allow for a more robust update mechanism, we explore the non-Bayesian Dempster-Shafer Theory (DST) methods.

The  Dempster-Shafer  theory  defines  sets  of  possible  outcomes  (or  realizations)  to 
a  random  variable  similar  to  standard  probability  theory.  However,  unlike  the  definition 
of  a  probability  distribution  that  assigns  probabilities  to  exclusive  outcomes,  Dempster- 
Shafer  theory  is  more  general  and  assigns  “mass”  values  not  only  to  events  but  also  to 
sets  of  events.  This  allows  combining  pieces  of  information  in  a  more  flexible  way. 

In the following paragraph, the basic Dempster-Shafer theory is defined mathematically, following Chapter 7.2.3 in (Hall & Llinas, 2001). Let $T$ be a set of mutually exclusive outcomes of an experiment ("frame of discernment") and let $\Omega = 2^T$ be the power set of $T$. The belief method assigns a "mass of evidence" $m$ to elements in $\Omega$.

The mass of evidence allocation, denoted by $m$, obeys the following rules:

$$m(\emptyset) = 0 \tag{2.17}$$

$$m(A) \geq 0, \quad \forall A \in \Omega \tag{2.18}$$

$$\sum_{A \in \Omega} m(A) = 1 \tag{2.19}$$

Similar to probability, the mass of the empty set is 0 (Equation (2.17)), every mass is non-negative (Equation (2.18)), and the total mass sums to 1 (Equation (2.19)). Unlike probability, mass can be assigned to any element of $\Omega = 2^T$, that is, to subsets of $T$ and not only to single outcomes (Equation (2.18)). Intuitively it can be viewed as similar to probability theory, but when a mass is assigned to a set, the probability can still "shift" between the elements in the set when new information is acquired.

Now we define useful terms of Dempster-Shafer theory.

Belief ($Bel$):

$$Bel(A) = \sum_{B \subseteq A} m(B) \tag{2.20}$$

The belief of $A$ can be interpreted as the mass of evidence assigned to $A$ and all its subsets. This is the minimal probability that is already assigned to $A$.

Plausibility ($Pl$):

$$Pl(A) = 1 - \sum_{A \cap B = \emptyset} m(B) = \sum_{A \cap B \neq \emptyset} m(B) \tag{2.21}$$

The plausibility is the mass of evidence that can possibly be assigned to $A$ in the future (that is, the mass not assigned to subsets that do not intersect with $A$).

The interval $[Bel(A), Pl(A)]$ can serve as a confidence interval for $A$'s probability (Hall & Llinas, 2001).
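Equations (2.20) and (2.21) translate directly into code. The following Python sketch is illustrative only: representing a mass assignment as a dictionary from `frozenset` to mass, the function names, and the three-element frame are assumptions made for the example.

```python
def belief(masses, a):
    """Equation (2.20): total mass of the focal sets contained in a."""
    return sum(m for b, m in masses.items() if b <= a)


def plausibility(masses, a):
    """Equation (2.21): total mass of the focal sets that intersect a."""
    return sum(m for b, m in masses.items() if b & a)


# Hypothetical frame of three departure times
T = frozenset({"t1", "t2", "t3"})
# Mass 0.9 on {t1}, and the remaining 0.1 left uncommitted on the whole frame
masses = {frozenset({"t1"}): 0.9, T: 0.1}

bel_t1 = belief(masses, frozenset({"t1"}))
pl_t1 = plausibility(masses, frozenset({"t1"}))
bel_t2 = belief(masses, frozenset({"t2"}))
pl_t2 = plausibility(masses, frozenset({"t2"}))
print((bel_t1, pl_t1), (bel_t2, pl_t2))
```

Because the mass on the whole frame intersects every singleton, it raises the plausibility of each time without raising its belief.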

As an example, let's assume we have one observation from a source, stating: "With probability 90%, the vessel departed at $t_1$, and with 10% the vessel could have departed at any time."

In this case, the masses are distributed as follows:

$$m(\{t_1\}) = 0.9, \qquad m(T) = 0.1 \tag{2.22}$$


And so the belief and plausibility can be calculated to be:

$$\begin{array}{l|ccc} & m & Bel & Pl \\ \hline \{t_1\} & 0.9 & 0.9 & 1 \\ T & 0.1 & 1 & 1 \\ \{t_2\} & 0 & 0 & 0.1 \end{array} \tag{2.23}$$


This is a very simple example, but we can already see that, as expected, the "confidence interval" for $t_1$ is between 0.9 and 1. The confidence interval for $t_2$, about which no information was given, is calculated to be $[0, 0.1]$.

On the other hand, suppose the source stated: "With probability 90%, I'm sure the vessel departed in $t_1$, and with 10% probability the vessel could have departed at any other time."

In this case, the masses are distributed as follows:

$$m(\{t_1\}) = 0.9, \qquad m(T - \{t_1\}) = 0.1 \tag{2.24}$$


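And so the belief and plausibility can be calculated to be:

$$\begin{array}{l|ccc} & m & Bel & Pl \\ \hline \{t_1\} & 0.9 & 0.9 & 0.9 \\ T - \{t_1\} & 0.1 & 0.1 & 0.1 \\ \{t_2\} & 0 & 0 & 0.1 \end{array} \tag{2.25}$$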


which yields different results, as now the confidence interval for $T - \{t_1\}$ is $[0.1, 0.1]$. The mass given to this set is 0.1.

A message can be defined in a similar manner, but with regard to multiple departure times. For an informant who says that the departure time is one of $\{t_1, t_2\}$, and who we know is correct only 0.9 of the time, the corresponding mass assignment should be:

$$m(\{t_1, t_2\}) = 0.9, \qquad m(T - \{t_1, t_2\}) = 0.1 \tag{2.26}$$
And the belief and plausibility in this case are:

$$\begin{array}{l|ccc} & m & Bel & Pl \\ \hline \{t_1, t_2\} & 0.9 & 0.9 & 0.9 \\ \{t_1\} & 0 & 0 & 0.9 \\ \{t_2\} & 0 & 0 & 0.9 \\ T - \{t_1, t_2\} & 0.1 & 0.1 & 0.1 \\ \{t_3\} & 0 & 0 & 0.1 \end{array} \tag{2.27}$$

2. Combination Rules


Now  that  we  have  defined  the  Dempster-Shafer  framework,  the  next  step  is  to 
discuss  how  masses  assigned  by  different  sources  should  be  combined  in  order  to  make 
sense  out  of  multiple  sources  of  information.  The  desired  outcome  of  such  a  combination 
of  mass  assignments  would  be  a  new  mass  assignment. 

Probabilistically,  if  we  have  the  probability  assigned  for  each  outcome  by 
multiple  sources  and  we  assume  independence,  the  combined  probability  distribution  can 
be  obtained  by  simply  multiplying  the  probabilities  assigned  by  different  sources.  In  the 
Dempster-Shafer  theory,  the  situation  is  more  complicated  and  thus  multiple  combination 
rules  are  suggested  with  slightly  different  characteristics.  The  different  combination  rules 
are  thoroughly  discussed  in  Sandia  Lab’s  report  (Sentz,  2002).  In  the  following 
paragraphs,  we  describe  a  few  of  the  prominent  rules  in  more  detail  with  examples. 


a.  Dempster-Shafer 

The first suggested rule is the Dempster-Shafer rule. Given two mass assignments by different sources, $m_1$ and $m_2$, the combined mass assignment $m_{1,2}$ of a set $A$ is calculated by adding up the products of the masses for all the sets $B$, $C$ such that their intersection is $A$:

$$m_{1,2}(A) = \frac{\sum_{B \cap C = A} m_1(B)\, m_2(C)}{1 - K}, \quad A \neq \emptyset \tag{2.28}$$


where $K$ is a normalization factor that accounts for the conflict, that is, all the pairs of sets that have an empty intersection and whose corresponding mass therefore cannot be assigned to any set:

$$K = \sum_{B \cap C = \emptyset} m_1(B)\, m_2(C) \tag{2.29}$$

As an example, assume the first message specifies departure times $\{t_1, t_2\}$ with probability of 0.9 and the second message corresponds to times $\{t_1, t_3\}$, out of 5 possible departure times $S = \{t_1, t_2, t_3, t_4, t_5\}$:

$$\begin{aligned} m_1(\{t_1, t_2\}) &= 0.9, & m_1(\{t_3, t_4, t_5\}) &= 0.1 \\ m_2(\{t_1, t_3\}) &= 0.9, & m_2(\{t_2, t_4, t_5\}) &= 0.1 \end{aligned} \tag{2.30}$$

The combined masses can be calculated by determining the masses of the intersections:

$$\begin{array}{l|cc} & m_1(\{t_1, t_2\}) = 0.9 & m_1(\{t_3, t_4, t_5\}) = 0.1 \\ \hline m_2(\{t_1, t_3\}) = 0.9 & m_{1,2}(\{t_1\}) = 0.81 & m_{1,2}(\{t_3\}) = 0.09 \\ m_2(\{t_2, t_4, t_5\}) = 0.1 & m_{1,2}(\{t_2\}) = 0.09 & m_{1,2}(\{t_4, t_5\}) = 0.01 \end{array} \tag{2.31}$$

In this case, there is no conflict between the sources; each two sets with positive masses have a non-empty intersection ($K = 0$).

However, if informants submit conflicting messages, the situation will be different. Let us see what happens if, in addition to the information before, the first informant is positive that the true departure time is not $t_3$, and therefore he does not assign any masses to sets that include $t_3$, resulting in the following mass assignments:

$$\begin{aligned} m_1(\{t_1, t_2\}) &= 0.9, & m_1(\{t_4, t_5\}) &= 0.1 \\ m_2(\{t_1, t_3\}) &= 0.9, & m_2(\{t_2, t_4, t_5\}) &= 0.1 \end{aligned} \tag{2.32}$$

$$\begin{array}{l|cc} & m_1(\{t_1, t_2\}) = 0.9 & m_1(\{t_4, t_5\}) = 0.1 \\ \hline m_2(\{t_1, t_3\}) = 0.9 & m_{1,2}(\{t_1\}) = 0.81 & K = 0.09 \\ m_2(\{t_2, t_4, t_5\}) = 0.1 & m_{1,2}(\{t_2\}) = 0.09 & m_{1,2}(\{t_4, t_5\}) = 0.01 \end{array} \tag{2.33}$$

Now there are two sets in the different assignments that have an empty intersection (the upper-right cell in the table). This mass is added to $K$, which measures the amount of "conflict." The way Dempster-Shafer handles this situation is by normalizing the conflict out, eventually assigning:

$$\begin{aligned} m_{1,2}(\{t_1\}) &= \frac{0.81}{1 - 0.09} \approx 0.89 \\ m_{1,2}(\{t_2\}) &= \frac{0.09}{1 - 0.09} \approx 0.1 \\ m_{1,2}(\{t_4, t_5\}) &= \frac{0.01}{1 - 0.09} \approx 0.01 \end{aligned} \tag{2.34}$$


Although this method may work well when the conflict is small, some paradoxes arise when the conflict between assignments is substantial. For example, if the first informant is almost certain that the departure time is $t_1$ but it might be $t_3$, and the second informant is quite sure about $t_2$ but allows that it might be $t_3$, then:

$$\begin{aligned} m_1(\{t_1\}) &= 0.9, & m_1(\{t_3\}) &= 0.1 \\ m_2(\{t_2\}) &= 0.9, & m_2(\{t_3\}) &= 0.1 \end{aligned} \tag{2.35}$$

$$\begin{array}{l|cc} & m_1(\{t_1\}) = 0.9 & m_1(\{t_3\}) = 0.1 \\ \hline m_2(\{t_2\}) = 0.9 & K = 0.81 & K = 0.09 \\ m_2(\{t_3\}) = 0.1 & K = 0.09 & m_{1,2}(\{t_3\}) = 0.01 \end{array} \tag{2.36}$$

Paradoxically, the combined assignment will be $m_{1,2}(\{t_3\}) = \frac{0.01}{1 - 0.99} = 1$, disregarding the possibility of $t_1$ or $t_2$. Resolving this paradox is one of the main incentives in examining other combination rules.
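The Dempster-Shafer rule and the high-conflict paradox can be reproduced with a short Python sketch. The dictionary-of-`frozenset` representation and the function name `dempster_combine` are assumptions made for illustration; the masses are those of the high-conflict example.

```python
def dempster_combine(m1, m2):
    """Combine two mass assignments: sum products per intersection, then divide by 1 - K."""
    combined, conflict = {}, 0.0
    for b, mass_b in m1.items():
        for c, mass_c in m2.items():
            inter = b & c
            if inter:
                combined[inter] = combined.get(inter, 0.0) + mass_b * mass_c
            else:
                # Empty intersection: this product contributes to the conflict K
                conflict += mass_b * mass_c
    if 1 - conflict < 1e-12:
        raise ValueError("total conflict: the rule is undefined")
    normalized = {a: m / (1 - conflict) for a, m in combined.items()}
    return normalized, conflict


t1, t2, t3 = frozenset({"t1"}), frozenset({"t2"}), frozenset({"t3"})
m1 = {t1: 0.9, t3: 0.1}  # first informant: almost certainly t1, possibly t3
m2 = {t2: 0.9, t3: 0.1}  # second informant: almost certainly t2, possibly t3

combined, K = dempster_combine(m1, m2)
print(K, combined)
```

Here $K$ is approximately 0.99 and all remaining mass lands on $\{t_3\}$, the outcome both informants considered unlikely.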

This rule can be generalized to combine more than two messages:

$$m_{1,2,\ldots,n}(A) = \frac{\sum_{\cap_i B_i = A} m_1(B_1)\, m_2(B_2) \cdots m_n(B_n)}{1 - K} \tag{2.37}$$

with $K$ being:

$$K = \sum_{\cap_i B_i = \emptyset} m_1(B_1)\, m_2(B_2) \cdots m_n(B_n) \tag{2.38}$$


b. Yager's Modified Dempster-Shafer Rule

This method is similar to the basic Dempster-Shafer rule, with one difference: instead of normalizing the masses $m$ by $1 - K$, $K$ is added to the mass of the entire set $T$. The mass of the entire set $T$ can be interpreted as the "ignorance," since this mass does not help in distinguishing between different departure times.

We now revisit the high conflict example defined in (2.35) and (2.36). Following the same calculation, the final assignment would be $m_{1,2}(\{t_3\}) = 0.01$, $m_{1,2}(T) = 0.99$, with confidence intervals:

$$\begin{array}{l|ccc} & m & Bel & Pl \\ \hline \{t_3\} & 0.01 & 0.01 & 1 \\ T & 0.99 & 1 & 1 \\ \{t_1\}, \{t_2\} & 0 & 0 & 0.99 \end{array} \tag{2.39}$$

This suggests that since the conflict is so large, every outcome is basically possible. This rule can also be extended to more than two pieces of evidence, but it is not associative. It is, however, commutative, as Equation (2.37) is symmetric.
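For comparison, the following Python sketch applies Yager's variant to the same high-conflict masses. The representation (dictionary from `frozenset` to mass) and the function name `yager_combine` are illustrative assumptions; the only change from the basic rule is that the conflict is moved to the whole frame instead of being normalized out.

```python
def yager_combine(m1, m2, frame):
    """Yager's rule: products of intersecting sets are kept as is; conflict goes to the frame."""
    combined, conflict = {}, 0.0
    for b, mass_b in m1.items():
        for c, mass_c in m2.items():
            inter = b & c
            if inter:
                combined[inter] = combined.get(inter, 0.0) + mass_b * mass_c
            else:
                conflict += mass_b * mass_c
    # The conflicting mass is treated as ignorance and assigned to the entire set T
    combined[frame] = combined.get(frame, 0.0) + conflict
    return combined


T = frozenset({"t1", "t2", "t3"})  # hypothetical three-time frame
m1 = {frozenset({"t1"}): 0.9, frozenset({"t3"}): 0.1}
m2 = {frozenset({"t2"}): 0.9, frozenset({"t3"}): 0.1}

combined = yager_combine(m1, m2, T)
print(combined)
```

The result keeps only 0.01 on $\{t_3\}$ and places the remaining 0.99 on $T$, so no single departure time is favored on the strength of two sources that largely contradict each other.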


c. Zhang's Center Combination Rule

This rule is yet another extension of the Dempster-Shafer combination rule. While the Dempster-Shafer rule does not account for the size of the intersection between two sets, Zhang's rule does so by multiplying the assigned outcome mass by a metric, since logically the mass assigned to the intersection of $B$ and $C$ should increase with its size. A common metric is the cardinality of the intersection relative to the cardinalities of the two sets:

$$r(B, C) = \frac{|B \cap C|}{|B| \cdot |C|} \tag{2.40}$$

and the combined mass is:

$$m_{1,2}(A) = k \sum_{B \cap C = A} r(B, C)\, m_1(B)\, m_2(C) \tag{2.41}$$

with $k$ a normalization factor such that $\sum_{A \subseteq S} m_{1,2}(A) = 1$ (not the same as $K$ that accounted for the conflict in Dempster-Shafer and Yager's rules).


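To see the difference between Zhang's rule and the regular Dempster-Shafer combination rule, let's look at the case where we have a positive observation regarding time $t_1$ and a message about times $t_1$ or $t_2$:

$$\begin{aligned} m_1(\{t_1\}) &= 0.9, & m_1(\{t_2, t_3, t_4, t_5\}) &= 0.1 \\ m_2(\{t_1, t_2\}) &= 0.9, & m_2(\{t_3, t_4, t_5\}) &= 0.1 \end{aligned} \tag{2.42}$$

As before, we calculate the masses of the intersections:

$$\begin{array}{l|cc} & m_1(\{t_1\}) = 0.9 & m_1(\{t_2, t_3, t_4, t_5\}) = 0.1 \\ \hline m_2(\{t_1, t_2\}) = 0.9 & m_{1,2}(\{t_1\}) = 0.81 & m_{1,2}(\{t_2\}) = 0.09 \\ m_2(\{t_3, t_4, t_5\}) = 0.1 & K = 0.09 & m_{1,2}(\{t_3, t_4, t_5\}) = 0.01 \end{array} \tag{2.43}$$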

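So the intervals for the departure times, after normalizing by $1 - K$ as in the Dempster-Shafer rule, are:

$$\begin{array}{l|ccc} & m & Bel & Pl \\ \hline \{t_1\} & 0.89 & 0.89 & 0.89 \\ \{t_2\} & 0.1 & 0.1 & 0.1 \\ \{t_3, t_4, t_5\} & 0.01 & 0.01 & 0.01 \end{array} \tag{2.44}$$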


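However, with Zhang's rule we also calculate the value of $r = \frac{|B \cap C|}{|B| \cdot |C|}$ for each intersection:

$$\begin{array}{l|cc} & m_1(\{t_1\}) = 0.9 & m_1(\{t_2, t_3, t_4, t_5\}) = 0.1 \\ \hline m_2(\{t_1, t_2\}) = 0.9 & m_{1,2}(\{t_1\}) = 0.81,\ r = \frac{1}{2} & m_{1,2}(\{t_2\}) = 0.09,\ r = \frac{1}{8} \\ m_2(\{t_3, t_4, t_5\}) = 0.1 & - & m_{1,2}(\{t_3, t_4, t_5\}) = 0.01,\ r = \frac{1}{4} \end{array} \tag{2.45}$$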

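And by using Equation (2.41), the final combined masses according to Zhang's rule are:

$$\begin{array}{l|ccc} & m & Bel & Pl \\ \hline \{t_1\} & 0.967 & 0.967 & 0.967 \\ \{t_2\} & 0.027 & 0.027 & 0.027 \\ \{t_3, t_4, t_5\} & 0.006 & 0.006 & 0.006 \end{array} \tag{2.46}$$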


Since the relative size of the intersection between the sets of $m_1(\{t_1\})$ and $m_2(\{t_1, t_2\})$ is much bigger than the others (which can be interpreted as a better agreement), $m_{1,2}(\{t_1\})$ receives a higher mass in Zhang's rule in comparison with Dempster-Shafer's combination rule.

A conflict between assignments is resolved by the normalization coefficient $k$, similar to Dempster-Shafer's combination rule.
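The worked example for Zhang's rule can be checked numerically. The sketch below is illustrative: the function name `zhang_combine` and the dictionary-of-`frozenset` representation are assumptions, while the weighting $r(B, C) = |B \cap C| / (|B| \cdot |C|)$ and the final normalization follow Equations (2.40) and (2.41).

```python
def zhang_combine(m1, m2):
    """Zhang's center combination: weight each product by r(B, C), then normalize."""
    weighted = {}
    for b, mass_b in m1.items():
        for c, mass_c in m2.items():
            inter = b & c
            if inter:
                r = len(inter) / (len(b) * len(c))  # Equation (2.40)
                weighted[inter] = weighted.get(inter, 0.0) + r * mass_b * mass_c
    total = sum(weighted.values())  # 1 / k in Equation (2.41)
    return {a: w / total for a, w in weighted.items()}


# Positive observation about t1 and a message {t1, t2}, five possible times
m1 = {frozenset({"t1"}): 0.9, frozenset({"t2", "t3", "t4", "t5"}): 0.1}
m2 = {frozenset({"t1", "t2"}): 0.9, frozenset({"t3", "t4", "t5"}): 0.1}

combined = zhang_combine(m1, m2)
for focal_set, mass in combined.items():
    print(sorted(focal_set), round(mass, 3))
```

The weighted products are $0.81 \cdot \frac{1}{2}$, $0.09 \cdot \frac{1}{8}$ and $0.01 \cdot \frac{1}{4}$, which after normalization give approximately 0.967, 0.027 and 0.006.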


d.  Mass  Mean 

This rule of combination is the most straightforward one. According to the mass average rule, the mass of a set in the combined mass assignment is simply the average of the masses the set received:

$$m_{1,2,\ldots,n}(A) = \frac{1}{n} \sum_{i=1}^{n} m_i(A) \tag{2.47}$$

Weighting the average according to some measure of confidence or reliability is also common, but this issue is not discussed in this thesis.
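A short Python sketch of the unweighted mass mean follows. The function name `mass_mean` and the example masses (taken from the high-conflict case) are illustrative choices, not part of the thesis.

```python
def mass_mean(assignments):
    """Average the mass each set receives over all sources (unweighted)."""
    n = len(assignments)
    combined = {}
    for masses in assignments:
        for focal_set, mass in masses.items():
            # A set missing from a source contributes zero mass from that source
            combined[focal_set] = combined.get(focal_set, 0.0) + mass / n
    return combined


m1 = {frozenset({"t1"}): 0.9, frozenset({"t3"}): 0.1}
m2 = {frozenset({"t2"}): 0.9, frozenset({"t3"}): 0.1}

combined = mass_mean([m1, m2])
print(combined)
```

Unlike the Dempster-Shafer rule, averaging keeps both strongly supported times, with 0.45 each, and only 0.1 on $\{t_3\}$.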




3.  Observations  and  Messages 

Defining  observations  and  messages  according  to  Dempster-Shafer  scheme  is  not 
straightforward.  We  define  these  terms  as  follows: 

We translate a positive observation with "correctness" $q$ regarding time $t_i$ into a mass assignment of:

$$m(\{t_i\}) = q, \qquad m(T - \{t_i\}) = 1 - q \tag{2.48}$$

This means that according to this observation, with probability $q$ the true departure time is $t_i$ and with probability $1 - q$ it is any other time.

Similarly, a message $M$ with "correctness" $q_k$ ($|M| = k$) can be translated to the following mass assignment:

$$m(M) = q_k, \qquad m(T - M) = 1 - q_k \tag{2.49}$$

In this case the true departure time is included in the message $M$ with probability $q_k$, and with probability $1 - q_k$ it is not in the message.

4.  Transforming  Dempster-Shafer  Belief  and  Plausibility  Measures  to 
Probability  Values 

In  order  to  support  decision  makers  we  must  have  some  well-defined  probability 
about  the  departure  time  of  the  target.  This  is  straightforward  with  the  Bayesian  method, 
but  the  Dempster-Shafer  theory  only  allows  us  to  obtain  Belief-Plausibility  intervals.  The 
following  subsection  describes  a  few  methods  of  translating  the  belief  function  (or  mass 
function)  to  a  probability  value. 

a.  Pignistic  Transformation 

The most popular transformation is the Pignistic Transformation (Smets, 1990) that distributes the mass assigned to a set uniformly among all the members of the set:

$$P_{pig}(t) = \sum_{A \in \Omega,\ t \in A} \frac{m(A)}{|A|} \tag{2.50}$$

where $P_{pig}(t)$ is the estimated probability of departure time $t$ by this transformation. This transformation is intuitive: when we are given the mass of a subset and are required to estimate the probability of each member of the set, a natural assumption is that the probability of all the members is equal.


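Let's examine the outcome of this transformation in the case of a single message:

$$m(\{t_1, t_2\}) = 0.9, \qquad m(\{t_3, t_4, t_5\}) = 0.1 \tag{2.51}$$

And the Belief and Plausibility in this case:

$$\begin{array}{l|ccc} & m & Bel & Pl \\ \hline \{t_1, t_2\} & 0.9 & 0.9 & 0.9 \\ \{t_1\} & 0 & 0 & 0.9 \\ \{t_2\} & 0 & 0 & 0.9 \\ \{t_3, t_4, t_5\} & 0.1 & 0.1 & 0.1 \\ \{t_i\},\ i = 3, 4, 5 & 0 & 0 & 0.1 \end{array}$$

Then the pignistic probability of each time is:

$$\begin{aligned} P_{pig}(t_1) &= P_{pig}(t_2) = \frac{0.9}{2} = 0.45 \\ P_{pig}(t_3) &= P_{pig}(t_4) = P_{pig}(t_5) = \frac{0.1}{3} \approx 0.033 \end{aligned} \tag{2.52}$$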

Smets (2002) claims that the pignistic transformation is the one adequate for making decisions; however, it does not necessarily represent your belief:

At the credal level, beliefs are represented by a belief function; at the pignistic level, this belief function induces a probability function that is used to make decisions. This probability function should not be understood as representing your beliefs, it is nothing but the additive measure needed to make decision, i.e., to compute the expected utilities.

However, one of the principles of Dempster-Shafer theory is that with further information the mass can shift within the set. This transformation, however intuitive, disregards this flexibility by dividing the mass equally among the members of the set (Cobb & Shenoy, 2003).
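The pignistic transformation is simple to compute. The Python sketch below is an illustration under assumptions: the function name `pignistic` and the dictionary-of-`frozenset` representation are hypothetical, and the masses are those of the single-message example with five departure times.

```python
def pignistic(masses):
    """Split the mass of every focal set equally among its members and sum per member."""
    probs = {}
    for focal_set, mass in masses.items():
        share = mass / len(focal_set)
        for t in focal_set:
            probs[t] = probs.get(t, 0.0) + share
    return probs


masses = {
    frozenset({"t1", "t2"}): 0.9,
    frozenset({"t3", "t4", "t5"}): 0.1,
}

p_pig = pignistic(masses)
print({t: round(p, 3) for t, p in sorted(p_pig.items())})
```

Each of $t_1$, $t_2$ receives $0.9 / 2 = 0.45$ and each of $t_3$, $t_4$, $t_5$ receives $0.1 / 3 \approx 0.033$.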


b.  Plausibility  Transformation 

Another transformation is the Plausibility transformation, which is basically a normalization of the plausibility function of the singletons:

$$P_{pl}(t) = \frac{Pl(\{t\})}{K} \tag{2.53}$$

with $K$ being the normalization factor:

$$K = \sum_{t_i \in T} Pl(\{t_i\}) \tag{2.54}$$


This transformation tries to keep the essence of Dempster-Shafer theory by considering the plausibility value of the departure times (that can be interpreted as the potential, or the biggest mass that can possibly be assigned to it if the right information is received) and normalizing those plausibilities. Looking at the same example as in the previous subsection, defined by Equation (2.51), we obtain:

$$\begin{aligned} \sum_{1 \leq i \leq n} Pl(\{t_i\}) &= 0.9 + 0.9 + 0.1 + 0.1 + 0.1 = 2.1 \\ P_{pl}(t_1) &= P_{pl}(t_2) = \frac{0.9}{2.1} \approx 0.429 \\ P_{pl}(t_3) &= P_{pl}(t_4) = P_{pl}(t_5) = \frac{0.1}{2.1} \approx 0.047 \end{aligned} \tag{2.55}$$

The probability assigned to $t_1$ and $t_2$ is lower now, to account for the fact that there are three other departure times that are plausible.
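The plausibility transformation can be sketched in the same style. The function name `plausibility_transform` and the data representation are illustrative assumptions; the masses are the single-message example with five departure times used for the pignistic case.

```python
def plausibility_transform(masses, frame):
    """Normalize the plausibilities of the singletons so that they sum to 1."""
    # Pl({t}) is the total mass of all focal sets that contain t
    pl = {t: sum(m for a, m in masses.items() if t in a) for t in frame}
    k = sum(pl.values())  # normalization factor K
    return {t: value / k for t, value in pl.items()}


frame = ["t1", "t2", "t3", "t4", "t5"]
masses = {
    frozenset({"t1", "t2"}): 0.9,
    frozenset({"t3", "t4", "t5"}): 0.1,
}

p_pl = plausibility_transform(masses, frame)
print({t: round(p, 3) for t, p in p_pl.items()})
```

The normalization factor is 2.1, so $t_1$ and $t_2$ each receive $0.9 / 2.1$ and the other three times each receive $0.1 / 2.1$.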

c.  Belief  Transformation 

A less useful transformation is the Belief transformation that normalizes the beliefs of the singletons but disregards information about non-singletons. This transformation will not be considered in this work (in the example discussed it is not even defined, since the belief of every singleton is zero).


F.  DISCUSSION 

We have suggested two different probability update models: the Bayesian update and Dempster-Shafer Theory. Both models can be updated with observations and messages. The Bayesian method is well known, popular and mathematically rigorous. However, it requires multiple assumptions that are often difficult to justify.

The Dempster-Shafer Theory includes multiple combination rules. There is no clear method that is considered appropriate for all situations. According to our mass assignments as defined in (2.24) and (2.26) we do not expect to have any conflict since all the possible departure times are members of sets that have a positive mass, and therefore, Dempster-Shafer and Zhang's rules are reliable and justifiable (Sentz, 2002). Yager's rule is useful whenever there is large conflict; however, in our case, since there is no conflict, it is no different than Dempster-Shafer's rule. The mean combination rule may not be appropriate when averaging extremes, but it is easy to compute and might provide satisfactory results in certain cases, and therefore is also of interest.

The transformation step from Dempster-Shafer theory measures to probabilities is also possible via several distinct methods. Although the pignistic method is more common in the literature, the Plausibility method has some appealing characteristics, and it is more consistent with Dempster-Shafer theory (Cobb & Shenoy, 2003).

In the following chapter we will compare the Dempster-Shafer, Zhang's and mean methods, each transformed into probabilities by the two transformation models, with the Bayesian approach.

III. BAYES AND DEMPSTER-SHAFER THEORY COMPARISON


In  this  chapter,  we  compare  the  Bayesian  update  process  and  Dempster-Shafer 
methods.  We  first  discuss  the  qualitative  pros  and  cons  of  each  method  and  show  the 
equivalence  between  the  two  in  certain  situations.  We  conclude  with  a  section  describing 
the  results  from  a  simulation  experiment. 

A.  QUALITATIVE  COMPARISON 

As discussed in the previous chapter, the Bayes' method requires a prior probability distribution $f_{prior}(t)$. Dempster-Shafer theory does not require a prior for computing the update distribution, although it can incorporate such a prior as an additional piece of information. The biggest advantage of Dempster-Shafer theory in our context is that it does not require one to specify the probabilities of receiving a message given the actual departure time: $P(M \mid T_d = t, t \in M)$ and $P(M \mid T_d = t, t \notin M)$. Data to estimate these conditional probabilities may not be available and so assumptions that are difficult to justify have to be made to obtain them. However, once we impose these assumptions, calculating the updated probabilities is straightforward using Bayes' theorem, and the result of the update is unique. Dempster-Shafer theory may utilize any one of multiple combination rules, and those rules may produce very different results depending on the agreement between the different pieces of information received.

The output of the Bayesian update gives us the estimated probability that a given departure time is in fact the true one. The Dempster-Shafer theory output is a distribution of masses that allows us to calculate Belief-Plausibility confidence intervals regarding the departure time. Those intervals give some insight about the probability of the vessel departing at a certain time, but in order to make decisions, and in particular when those intervals are large, an additional transformation is required to obtain probabilities. This transformation results in a degradation of the flexibility that makes Dempster-Shafer theory so appealing.

Computation-wise, the Dempster-Shafer theory machinery is much more intensive since each member in the power set of $T$ may be assigned a mass. A Bayesian update assigns probabilities only to the members of the set itself, not its power set. However, the Bayes' method may incur additional computational costs if calculating $P(M \mid T_d = t, t \in M)$ and $P(M \mid T_d = t, t \notin M)$ from existing data is difficult. The following table summarizes the main differences between the Bayes' updating process and the Dempster-Shafer theory:


| | Bayes' Method | Dempster-Shafer Theory |
|---|---|---|
| Prior | A prior distribution of the outcomes is required | Prior distribution not required |
| Event distribution | The probability of receiving each message in every state of the world is required | Only mass distribution is required |
| Combination Rules | Bayes' Formula | Several different combination rules |
| Output | Probability distribution of outcomes | Results in Belief and Plausibility of outcomes and sets of outcomes. Requires transformation to obtain probabilities |
| Computation | Computationally easy, assigns probability values to the members of T. | Computationally intensive, requires assigning values to members of the power-set of T. |

Table 1. Bayesian update - DST comparison.


B.  BAYESIAN  UPDATE  -  DEMPSTER  SHAFER  ZHANG  EQUIVALENCE 

As we saw in Chapter II, the update process can be performed using multiple methods. In this section we show that the Bayesian update and the Dempster-Shafer Zhang method with a pignistic transformation produce the same probabilities under the model assumptions described in Chapter II. As stated, for the Bayesian model we assume that (a) $q_k$ is known, (b) the probabilities of receiving true messages of a certain size are equal and the probabilities of receiving false messages of the same size are also equal, and (c) the Bayes' prior is a uniform distribution. The Zhang combination method is performed as described in Chapter II, Equation (2.41). Assuming that (a) $q_k$ is known and (b) a message $M_k$ of size $k$ is received yields the following mass assignment:

$$m(M_k) = q_k, \qquad m(T - M_k) = 1 - q_k \tag{3.1}$$


In order to show the Bayes - Zhang pignistic equivalence we calculate the probabilities assigned to a certain departure time $t$ by both methods after $N$ messages are received. We assume that the informant included the specific departure time $t$ in exactly $N_{in}$ messages out of the $N$ received, and $N_{out} = N - N_{in}$.


1.  Bayesian  Update 


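Recall from Equation (2.14) the update equation when the message size is $k$:

$$f_{update}(t) = \begin{cases} \dfrac{\frac{q_k}{k}}{\sum_{s \in M} f(s) \frac{q_k}{k} + \sum_{s \notin M} f(s) \frac{1 - q_k}{n - k}} & t \in M \\[3ex] \dfrac{\frac{1 - q_k}{n - k}}{\sum_{s \in M} f(s) \frac{q_k}{k} + \sum_{s \notin M} f(s) \frac{1 - q_k}{n - k}} & t \notin M \end{cases} \tag{3.2}$$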


where $f(s)$ is the probability distribution before the update. Since the denominator of the update function is merely a normalization coefficient, we can state that without normalizing, the updated distribution is:

$$f_{update}(t) \propto \begin{cases} \dfrac{q_k}{k} & t \in M \\[2ex] \dfrac{1 - q_k}{n - k} & t \notin M \end{cases} \tag{3.3}$$

Applying Bayes' theorem and using the fact that the message probability depends only on the true departure time and $q_k$, and that the messages are independent given the departure time, allows us to formulate the update function after $N$ messages (the uniform prior is absorbed into the normalization constant):

$$P(T_d = t \mid M_k^1, M_k^2, \ldots, M_k^N) \propto \prod_{i=1}^{N} P(M_k^i \mid T_d = t) \tag{3.4}$$

Since $P(M_k \mid T_d = t) \propto \frac{q_k}{k}$ for messages that include $t$ and $P(M_k \mid T_d = t) \propto \frac{1 - q_k}{n - k}$ for messages that do not include $t$, the update function after $N_{in}$ messages that include $t$ and $N_{out}$ that do not include $t$ is:

$$P(T_d = t \mid M_k^1, M_k^2, \ldots, M_k^N) \propto \left(\frac{q_k}{k}\right)^{N_{in}} \left(\frac{1 - q_k}{n - k}\right)^{N_{out}} \tag{3.5}$$


This update function can later be normalized, although it is not required for proving the equivalence to the Pignistic Zhang method.


2.  Pignistic  Zhang  formulation 

Recall  the  Zhang’s  combination  rule  from  Equations  (2.40)  and  (2.41): 

/  \  /  \  /  ,  \A  B 

mh2(c)  =  k  2^  mi(A)m2(B)1——^ 

AnB=C  \A\\B\  (3.6) 

The specific departure time $t$ is included in $M_k$ in $N_{in}$ of the messages, and included in $T - M_k$ in $N_{out}$ of the messages. In order to eventually calculate the probability of departure at time $t$, let us first look at the intersection of the sets that include $t$ for all the messages, $C$:

$$
C = M_k^1 \cap M_k^2 \cap \dots \cap M_k^{N_{in}} \cap \{T - M_k^{N_{in}+1}\} \cap \dots \cap \{T - M_k^{N}\}
\qquad (3.7)
$$


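Since for every message $M_k$, $t$ is included either in $M_k$ or in $\{T - M_k\}$, the sum in Equation (3.6) is reduced to a single term and $C$ can be interpreted as the set that includes $t$ after the combination of all the messages received. The mass of $C$ is:

$$
m_{combined}(C) \propto \prod_{i:\, t \in M_k^i} m_i(M_k^i) \prod_{j:\, t \notin M_k^j} m_j(\{T - M_k^j\}) \cdot \frac{|C|}{\prod_{i:\, t \in M_k^i} |M_k^i| \; \prod_{j:\, t \notin M_k^j} |\{T - M_k^j\}|}
\qquad (3.8)
$$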


We know that the mass assignment is $m(M_k) = q_k$, $m(T - M_k) = 1 - q_k$ and that the sizes of the sets are $|M_k| = k$ and $|\{T - M_k\}| = n - k$. By substituting those expressions in Equation (3.8) we have:

$$
m_{combined}(C) \propto \frac{q_k^{N_{in}} (1 - q_k)^{N_{out}}\, |C|}{k^{N_{in}} \cdot (n - k)^{N_{out}}}
\qquad (3.9)
$$


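Once we have the combined mass assignment, we can calculate the pignistic probability of time $t$ according to Equation (2.50):

$$
\mathrm{BetP}(t) = \sum_{C \subseteq \Omega \ \mathrm{s.t.}\ t \in C} \frac{m(C)}{|C|} \propto \frac{q_k^{N_{in}} (1 - q_k)^{N_{out}}\, |C|}{k^{N_{in}} \cdot (n - k)^{N_{out}}\, |C|} = \frac{q_k^{N_{in}} (1 - q_k)^{N_{out}}}{k^{N_{in}} \cdot (n - k)^{N_{out}}}
\qquad (3.10)
$$

which equates to the expression achieved via the Bayesian update in Equation (3.5).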
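
The equivalence can also be verified numerically on a small example. The sketch below is a Python illustration under stated assumptions, not the thesis code: it represents a mass assignment as a dictionary from subsets to masses, applies Zhang's rule of Equation (3.6) message by message, takes the pignistic transformation, and compares the result with the normalized Bayesian posterior under a uniform prior. The helper names and the three example messages are hypothetical.

```python
from fractions import Fraction
from itertools import product


def message_mass(message, frame, q):
    """Mass q on the message set and 1 - q on its complement."""
    return {frozenset(message): q, frozenset(frame - message): 1 - q}


def zhang_combine(m1, m2):
    """Zhang's rule: weight each product by |A n B| / (|A| |B|), then normalize."""
    combined = {}
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        c = a & b
        if c:  # an empty intersection receives zero weight
            weight = Fraction(len(c), len(a) * len(b))
            combined[c] = combined.get(c, 0) + mass_a * mass_b * weight
    total = sum(combined.values())
    return {c: v / total for c, v in combined.items()}


def pignistic(mass, frame):
    """Split the mass of every focal set evenly among its elements."""
    return {t: sum(v / len(c) for c, v in mass.items() if t in c) for t in frame}


def bayes_posterior(messages, frame, q):
    """Normalized product of per-message likelihood weights, uniform prior."""
    n = len(frame)
    unnorm = {}
    for t in frame:
        weight = Fraction(1)
        for msg in messages:
            k = len(msg)
            weight *= q / k if t in msg else (1 - q) / (n - k)
        unnorm[t] = weight
    total = sum(unnorm.values())
    return {t: v / total for t, v in unnorm.items()}


# Hypothetical example: n = 9, k = 3, q_k = 0.7.
frame = frozenset(range(1, 10))
q = Fraction(7, 10)
messages = [frozenset({7, 8, 9}), frozenset({2, 5, 9}), frozenset({1, 3, 4})]

combined = message_mass(messages[0], frame, q)
for msg in messages[1:]:
    combined = zhang_combine(combined, message_mass(msg, frame, q))

bet = pignistic(combined, frame)
bayes = bayes_posterior(messages, frame, q)
print("pignistic Zhang equals Bayes:", bet == bayes)
```

The size ratio in Zhang's rule telescopes across successive combinations, so the factor $|C|$ left in the combined mass is exactly what the pignistic transformation divides out.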

C.  SIMULATION 

We construct a simulation experiment, built on the simulation described in (Martin, 2009) and implemented in MATLAB, to further compare the Bayesian and Dempster-Shafer methods. Using the simulation we study the process of combining pieces of evidence. The simulation mimics the production of different messages, and its main output is the distribution specifying the probability that the target left at any particular departure time (a probability distribution over $T$). We construct the simulation in two parts: 1) generating the stream of messages, which represents the state of the world and does not depend on the update mechanism, and 2) updating the departure time distribution using the different methods described thus far. For a fixed number of messages received, we run the simulation multiple times and calculate the fraction of runs in which the correct departure time has the highest probability after performing the updates with a certain method.


1.  Generating  Messages 

First, the simulation generates a stream of messages of size $k$ out of $n$ possible departure times, assuming the informant's reliability is $q_k$. That is, on average, a fraction $q_k$ of the messages include the true departure time. Next, we examine the update methods on different streams of messages. We examine streams of messages that are generated both according to the assumption that all true messages are equally likely (see Chapter II), and when this assumption is relaxed. We define an input parameter $NU \in [0,1)$ to describe the measure of non-uniformity: $NU = 0$ implies that the messages are created uniformly, and as $NU$ increases to 1, the non-uniformity of messages also increases. The exact effect of this parameter is described below.

We assume, without loss of generality, that the true departure time $T_d$ is the last one possible: $T_d = n$.

The  method  for  generating  the  messages  of  size  k  proceeds  as  follows: 

•  First,  we  determine  whether  the  message  includes  the  true  departure  time.  We 
do  this  by  generating  a  random  Bernoulli  variable  with  parameter  qk . 

•  If  the  message  is  true,  we  include  the  true  departure  time  in  it.  If  it  is  false,  we 
make  sure  that  it  does  not  include  the  true  departure  time.  Next,  we  populate 
the  rest  of  the  message  with  departure  times: 

•  If $NU = 0$, the rest of the departure times are picked uniformly, that is, each of the possible times has the same probability of being included in the message.

•  If $NU > 0$, a random departure time $t$ is drawn from a geometric distribution with parameter $NU$. If $t \in [1, \dots, n]$, $t$ is not the true departure time, and $t$ is not already included in the message, $t$ is added to the message. Otherwise, $t$ is drawn again. This process repeats until the message is filled with $k$ departure times. As an example, let us look at the probability of picking different departure times for some values of $NU$ when $n = 9$, as depicted in Figure 6:


Figure 6. Probability of populating the message with departure times for different NU values. (Horizontal axis: departure time.)

The  bigger  NU,  the  more  likely  smaller  values  will  populate  the  message.  In  other 
words,  messages  with  small  values  of  t  will  be  more  likely  to  be  generated.  Once  the 
messages  are  created,  the  different  update  methods  are  used  in  order  to  estimate  the  true 
departure  time. 
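
The message-generation procedure above can be sketched as follows. This is an illustrative Python version, not the MATLAB implementation used for the results; the function names are hypothetical. Two assumptions are not stated in the text: the geometric distribution is taken to have support $\{1, 2, \dots\}$ with success probability $NU$, and for $NU = 0$ the remaining times are drawn uniformly without replacement from the incorrect departure times.

```python
import math
import random


def geometric(p, rng):
    """Inverse-transform sample from a geometric distribution on {1, 2, ...}."""
    u = rng.random()  # uniform on [0, 1), so 1 - u lies in (0, 1]
    return int(math.log(1.0 - u) / math.log(1.0 - p)) + 1


def generate_message(n, k, q_k, nu, rng, true_time=None):
    """Return (message, is_true) for one informant message of size k."""
    true_time = n if true_time is None else true_time
    is_true = rng.random() < q_k              # Bernoulli(q_k) draw
    message = {true_time} if is_true else set()
    wrong_times = [t for t in range(1, n + 1) if t != true_time]

    if nu == 0:
        # Uniform case: every incorrect time is equally likely to be included.
        message.update(rng.sample(wrong_times, k - len(message)))
    else:
        # Non-uniform case: redraw until k distinct admissible times are found.
        while len(message) < k:
            t = geometric(nu, rng)
            if 1 <= t <= n and t != true_time and t not in message:
                message.add(t)
    return frozenset(message), is_true


rng = random.Random(0)                        # fixed seed for repeatability
stream = [generate_message(n=9, k=3, q_k=0.7, nu=0.4, rng=rng) for _ in range(30)]
share_true = sum(is_true for _, is_true in stream) / len(stream)
print("fraction of true messages in this stream:", share_true)
```

Because the rejection loop only ever adds incorrect times, a false message can never contain the true departure time, and a true message contains it exactly once.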

2.  Estimating  the  Probabilities  of  the  Departure  Times 

We use six different methods to generate the combined probabilities:

1.  Bayesian method, with a uniform prior and an updating process that assumes that all true messages of a certain size are equally likely, and that the same is true for false messages. The update process uses this assumption, but the messages generated might not be drawn according to it if $NU > 0$.

2.  Dempster-Shafer rule, where the combined probability is derived from the pignistic method.

3.  Dempster-Shafer rule, where the resulting probability is derived from the plausibility method.

4.  Zhang's rule, where the resulting probability is derived from the plausibility method.

5.  Mean rule with a pignistic transformation.

6.  Mean rule with a plausibility transformation.

Each of these methods is applied according to the description in Chapter II, assuming that there is an estimate of the informant's reliability, $q_e$. Note that this parameter does not have to equal the true informant's reliability $q_k$, since the estimate of the informant's reliability might not be correct. Different values of $q_e$ and $q_k$ will be tested in the simulation.

Since the messages are not necessarily created uniformly, the update processes might use the wrong distributional assumptions. However, we are interested in examining how well Bayes' update method performs, in comparison with the Dempster-Shafer methods, even when it uses the wrong distributions for its updating process. While the Dempster-Shafer methods have fewer explicit assumptions than the Bayesian update approach, the combination rules are somewhat arbitrary and may carry hidden assumptions that differ from one combination rule to another.

3.  Constructing  the  Results 

We focus on two measures of effectiveness (MOE) in our analysis: (1) the average probability assigned to the true departure time, and (2) the percent of the runs in which the true departure time has the highest probability. After each simulation run we record the probability specified for the true departure time for each of the updating methods. We also tabulate for each method whether the true departure time has the highest final probability associated with it. We do this because a decision maker who needs to take action would select the time with the highest probability. After many runs of the simulation we can calculate the average probability assigned to the true departure time as a function of the number of messages received, and the percent of the runs in which the true departure time has the highest probability. We choose the latter as our main MOE. We also calculate the standard deviation of this MOE across the runs conducted.
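
Both measures can be computed from the final distributions of a set of runs, as in the Python sketch below. It is illustrative only: the function name and the four toy distributions are hypothetical, a run is counted as a hit only when the true time is the strict maximum (the text does not say how ties are handled), and the population standard deviation is used (the text does not say which estimator was applied).

```python
from statistics import mean, pstdev


def measures_of_effectiveness(distributions, true_time):
    """Return (average probability of the true time, hit rate, std of hits)."""
    probs = [d[true_time] for d in distributions]
    hits = [
        1.0 if all(d[true_time] > p for t, p in d.items() if t != true_time) else 0.0
        for d in distributions
    ]
    return mean(probs), mean(hits), pstdev(hits)


# Four toy runs over three departure times; the true departure time is 3.
runs = [
    {1: 0.25, 2: 0.25, 3: 0.5},
    {1: 0.5, 2: 0.25, 3: 0.25},
    {1: 0.125, 2: 0.125, 3: 0.75},
    {1: 0.25, 2: 0.5, 3: 0.25},
]
avg_prob, hit_rate, hit_std = measures_of_effectiveness(runs, true_time=3)
print(avg_prob, hit_rate, hit_std)
```

In the toy data the true time wins in two of the four runs, so the hit rate is one half even though its average probability is below one half, which illustrates why the two measures can rank methods differently.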

4.  Input  Parameters 

The  input  parameters  of  the  simulation  are  as  follows: 

1.  Number  of  possible  departure  times  (cardinality  of  T )  -  n 

2.  Size  of  each  message  -  k 

3.  The  true  departure  time  -  td  (without  loss  of  generality  it  is  fixed  to  be  the  latest 
possible  departure  time  -  n). 

4.  The true value of the probability that a message contains the true departure time - $q_k$. This parameter controls the message generation process and is not known to the operator. $q_k$ can be interpreted as the "true reliability" of the informant.

5.  The estimated probability that a message is true - $q_e$. $q_e$ can be interpreted as the operator's estimate of the informant's reliability. The updating process requires an estimate of this probability, and we assume that the operator knows this estimate ahead of time (it is an input to the simulation). We assume this value is constant but does not have to be equal to $q_k$, as happens when the operator does not correctly estimate the reliability of the source. This allows us to generate a stream of messages that differs from the Bayes' and Dempster-Shafer theory update assumptions.

6.  Non-uniformity parameter $NU$ - controls the non-uniformity of the messages produced and takes values between 0 and 1.

7.  Number  of  messages  constructed  in  the  simulation. 

8.  Number  of  runs  for  each  set  of  parameters. 

5.  Design  of  Experiments 

We have conducted multiple runs of different scenarios where the number of possible departure times is $n = 9$ ($= |T|$) and the size of the message is $k = 3$. The departure time is fixed to be $T_d = n = 9$.

We  construct  a  full  design  with  the  parameter  values  stated  in  Table  2: 


Parameter    Values
$q_k$        0.4, 0.7, 0.95
$q_e$        0.4, 0.7, 0.95
$NU$         0, 0.2, 0.4, 1

Table 2.  Parameter values.


Note that if the informant is clueless and thus chooses the departure times in the message totally randomly (i.e., the informant provides no useful information), $q_k$ would equal $\frac{k}{n} = \frac{1}{3}$. All the $q_k$ values used in the simulation (see Table 2) imply a "useful" informant that provides true messages with probability higher than that generated from a uniform distribution.

For  each  set  of  parameter  values,  a  stream  of  30  messages  is  created  100  times. 
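
The size of the full design and the "clueless informant" threshold can be checked with a few lines of Python. The sketch is illustrative: the variable names are hypothetical, and the parameter values are copied from Table 2 as printed.

```python
from itertools import product

n, k = 9, 3
clueless_reliability = k / n        # q_k of an informant who picks times at random

q_true_values = [0.4, 0.7, 0.95]    # true reliability q_k (Table 2)
q_est_values = [0.4, 0.7, 0.95]     # estimated reliability q_e (Table 2)
nu_values = [0, 0.2, 0.4, 1]        # non-uniformity values as printed in Table 2

# Full factorial design: every combination of the three parameters.
design = list(product(q_true_values, q_est_values, nu_values))

runs_per_scenario = 100
messages_per_run = 30
total_messages = len(design) * runs_per_scenario * messages_per_run

print("scenarios:", len(design))
print("messages generated in total:", total_messages)
print("all informants useful:", all(q > clueless_reliability for q in q_true_values))
```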

6.  Simulation  Results 

For most of the input parameter values, the different methods produce similar results. Figures 7 and 8 show the results obtained for one such scenario, where $q_k = q_e = 0.7$, $NU = 0$. Figure 7 depicts the percent of the runs in which the true departure time has the highest probability, while Figure 8 shows the average probability assigned to the true departure time.


Figure 7. Probability of picking the correct departure time. 100 runs with parameters $q_k = q_e = 0.7$, $NU = 0$. (Plot title: "Probability of Correct Departure Time"; curves: Bayes, DS-Plausibility, DS-Pignistic, Zhang-Plausibility, Mean-Plausibility, Mean-Pignistic.)


As  one  would  expect,  the  probability  of  choosing  the  correct  departure  time 
increases  with  the  number  of  messages  received,  for  all  methods.  This  occurs  because  the 
operator  gains  more  useful  information  regarding  the  departure  time.  As  the  number  of 
received  messages  increases,  the  probability  that  an  incorrect  departure  time  is  included 
in  more  messages  than  the  correct  departure  time  decreases. 

However, the probability assigned by the different methods to the correct departure time can differ significantly, as Figure 8 shows.

Figure 8. Probability assigned to the correct departure time. 100 runs with parameters $q_k = q_e = 0.7$, $NU = 0$. (Plot title: "Estimated Probability of the Correct Departure Time"; curves: Bayes, DS-Plausibility, DS-Pignistic, Zhang-Plausibility, Mean-Plausibility, Mean-Pignistic.)


Based on the results in Figure 8 we can divide the six methods into three groups. The probability assigned to the correct departure time by the Bayesian and Zhang methods increases at the fastest rate as a function of the number of messages. The probability that the two Dempster-Shafer combination rules (plausibility and pignistic) assign to the correct departure time increases at a slower rate. The reason is that the two Dempster-Shafer combination rules do not take into account the sizes of the sets combined when assigning the mass of the intersection of those sets, as described in Chapter II. Interestingly, under the mean combination methods (plausibility and pignistic) the probability assigned to the correct departure time does not change as more messages are received. Let us look deeper into this phenomenon by examining the pignistic-mean method.


A single message is true with probability $q_k$ and false with probability $1 - q_k$. The expected probability assigned to the correct departure time, according to the pignistic transformation defined in (2.50), is $q_k \cdot \frac{q_k}{k} + (1 - q_k) \cdot \frac{1 - q_k}{n - k} \approx 0.18$ for the parameter values $q_k = q_e = 0.7$. Now let us look at the situation after two messages, $M_k^1$ and $M_k^2$. The mass assignment that corresponds to those messages is $m_1(M_k^1) = q_k$, $m_1(T - M_k^1) = 1 - q_k$ and $m_2(M_k^2) = q_k$, $m_2(T - M_k^2) = 1 - q_k$. According to the mean rule (2.47), the combined assignment would be:

$$
m_{1,2}(A) = \frac{m_1(A) + m_2(A)}{2}
$$


Now let us calculate the probability assigned to the correct departure time according to the pignistic transformation. Let us define $0 \le N_{in} \le 2$ as the number of messages $t$ is included in, as in section A. For every possible value of $N_{in}$ we calculate the probability that $N_{in} = n_{in}$ and the probability assigned to time $t$ given that it is included in $n_{in}$ messages. The explicit calculations are shown in the following table:

$n_{in}$    $P(N_{in} = n_{in})$            $m_{1,2}(t \mid N_{in} = n_{in})$
0           $(1 - q_k)^2 = 0.09$            $\frac{1}{2} \cdot \frac{1 - q_k}{n - k} + \frac{1}{2} \cdot \frac{1 - q_k}{n - k} = \frac{1 - q_k}{n - k} = 0.05$
1           $2 q_k (1 - q_k) = 0.42$        $\frac{1}{2} \cdot \frac{q_k}{k} + \frac{1}{2} \cdot \frac{1 - q_k}{n - k} \approx 0.14$
2           $q_k^2 = 0.49$                  $\frac{1}{2} \cdot \frac{q_k}{k} + \frac{1}{2} \cdot \frac{q_k}{k} = \frac{q_k}{k} \approx 0.23$

Table 3.  Probability of having $n_{in}$ messages that include $t$ and the mass assigned to $t$.


From those three possibilities we can calculate that the expected probability assigned to the correct time would be

$$
E\left[m_{1,2}(t)\right] = \sum_{n_{in} = 0,1,2} P(N_{in} = n_{in}) \cdot m_{1,2}(t \mid N_{in} = n_{in}) \approx 0.18.
$$

Doing the same calculation with an increasing number of messages yields similar results.
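
The calculation can be repeated for any number of messages with the short Python sketch below. It is an illustration under one assumption not spelled out in the text: the mean rule for $N$ messages is taken to be the arithmetic mean of the $N$ mass assignments, so that each message contributes $\frac{1}{N}$ of its pignistic share. The function name is hypothetical.

```python
from math import comb


def expected_mean_pignistic(num_messages, n, k, q_true, q_est):
    """Expected pignistic probability of the true time under the mean rule."""
    share_in = q_est / k                  # share from a message that includes t
    share_out = (1 - q_est) / (n - k)     # share from a message that excludes t
    expected = 0.0
    for n_in in range(num_messages + 1):
        # Binomial probability that exactly n_in messages include the true time.
        prob = (comb(num_messages, n_in)
                * q_true ** n_in
                * (1 - q_true) ** (num_messages - n_in))
        n_out = num_messages - n_in
        bet = (n_in * share_in + n_out * share_out) / num_messages
        expected += prob * bet
    return expected


for num_messages in (1, 2, 10, 30):
    value = expected_mean_pignistic(num_messages, n=9, k=3, q_true=0.7, q_est=0.7)
    print(num_messages, round(value, 4))
```

Because the mean rule averages the per-message shares, the expectation is the same for every number of messages, which is why the mean methods show no growth in the probability assigned to the correct departure time.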

Although the probability of the correct departure time derived by the two mean methods (pignistic and plausibility) is much lower than in the other methods, the probabilities assigned to the incorrect times are $\frac{1 - 0.18}{n - 1} \approx 0.10$ (when they are uniform). Thus, as Figure 7 illustrates, using the mean Dempster-Shafer methods yields the correct departure time in most cases. We conclude that although the mean methods assign incorrect probabilities to the departure times, they still point to the departure time with the highest probability as accurately as the other methods.

For most of the input combinations of Table 2, the methods produced nearly identical results. However, there are two sets of input values that yielded non-trivial differences between the methods: 1) when the true reliability value is low, $q_k = q_e = 0.4$, $NU > 0$. For these input values the Bayesian, mean-pignistic, and Zhang-plausibility methods had the highest probability of picking the correct departure time; and 2) when the estimated reliability is lower than the true one, $q_k = 0.7, 0.95$, $q_e = 0.4$, $NU = 0$. For these input values, the combination methods differ, with Bayes' and mean-pignistic methods performing the best, followed by Zhang-plausibility.

The input values that yield the greatest difficulties for the update process are those with low informant reliability and some non-uniformity of the generated messages, or those cases where the estimated reliability is less than the true reliability. Unexpectedly, the Bayesian update proves to be robust and performs near the top over all scenarios, even when Bayes' assumptions do not hold. Mean-pignistic performed nearly as well as Bayes, with Zhang-plausibility performing slightly worse. The Dempster-Shafer combination rule and the mean-plausibility rule do not perform as well.

As an example of one of the cases where the methods differ, let us look closely at the results for input values of $q_k = q_e = 0.4$, $NU = 0.4$, in Figure 9:

Figure 9. Probability of picking the correct departure time. 100 runs with parameters $q_k = q_e = 0.4$, $NU = 0.4$. (Plot title: "Probability of Correct Departure Time"; curves: Bayes, DS-Plausibility, DS-Pignistic, Zhang-Plausibility, Mean-Plausibility, Mean-Pignistic.)


Bayes, Zhang and mean-pignistic methods clearly outperform the other methods. The probability of choosing the right departure time increases slowly for these three methods, but actually decreases for the other methods. The intuition behind this phenomenon is that the Dempster-Shafer combination rules disregard the set sizes (as discussed in this section) and are misled by the bias toward wrong departure times in messages that are generated non-uniformly. Figure 10 emphasizes this point by showing how the probability of the true departure time changes as the number of messages increases:


Figure 10. Probability assigned to the correct departure time. 100 runs with parameters $q_k = q_e = 0.4$, $NU = 0.4$. (Plot title: "Estimated Probability of the Correct Departure Time"; curves: Bayes, DS-Plausibility, DS-Pignistic, Zhang-Plausibility, Mean-Plausibility, Mean-Pignistic.)

It is not surprising that the Dempster-Shafer combination rule performs poorly, since it does not take into account the size of the combined sets. Interestingly, the mean-pignistic method performs better than the mean-plausibility method, implying that the pignistic transformation is more appropriate in our context. This is consistent with the conclusion found in (Smets, 2002) that the pignistic transformation is proper whenever translating beliefs to decisions is required.

It is not obvious that the probability of choosing the correct departure time should decrease for some of the methods. This can be explained by the non-uniformity parameter, which biases messages toward incorrect departure times, and by the difficulty of coping with low reliability values.

D.  DISCUSSION 

In this chapter we compared the characteristics and performance of the updating methods described in Chapter II. The Bayes' method is mathematically rigorous but requires a number of assumptions not needed for the Dempster-Shafer methods. However, there are several ways to implement the Dempster-Shafer update, and it is not clear in advance which implementation would be most appropriate for a given scenario.

We have developed a simulation and used it to compare the different updating methods under different conditions. Our analysis reveals that even when the assumptions of the Bayes' update process are violated, that is, when the messages provided by the informant are not constructed uniformly, the Bayes' update still yields the best results. The Dempster-Shafer methods did not perform better than the Bayes' update method even though they do not explicitly assume uniformity. Amongst the Dempster-Shafer methods, Zhang's and the mean combination rules perform better. Amongst the transformations from belief to probabilities, the pignistic transformation was found to be more appropriate in our scenario because the probability assigned is used for decision-making, and not just for representing the measure of belief.

All the methods performed poorly when the reliability of the informant is low, or mistaken to be low, and there is non-uniformity in the way he produces messages.

IV.  EXTENSIONS


In  the  previous  chapters  we  assume  a  single  vessel  that  can  only  travel  on  one 
route.  In  this  chapter  we  extend  the  model  to  include  multiple  routes  and  multiple  vessels. 


A.  SINGLE  VESSEL  AND  MULTIPLE  NON-INTERSECTING  ROUTES 

Let  us  consider  the  case  where  the  vessel  may  use  one  of  multiple  routes  that  do 
not  intersect.  Recall  in  Chapter  II  that  we  assume  the  speed  of  the  target  is  fixed  and 
known  and  therefore  its  location  at  any  time  could  be  derived  from  the  departure  time. 
Now,  the  location  is  determined  by  the  departure  time  and  route. 

Let $R = \{r_1, \dots, r_{|R|}\}$ be a set of non-intersecting routes. Thus, the tuple $w \in W \subseteq T \times R$ is sufficient to describe the target location at any given time. We also redefine $n$ as the number of possible combinations of departure time and route, $n = |W|$.

We assume that an observation gives us information regarding a combination of departure time and route ("The radar has detected a vessel leaving at 8 on the route that is close to the coast"). A message from the informant relates to a subset of the possible departure time and route combinations, for example: "The vessel leaves at 8 or 10 am on route 1 or 2," or "The vessel will be on route 3, departure time is unknown."

Let $w'$ be a two-dimensional parameter denoting the departure time and route. Now we can perform either the Bayesian or Dempster-Shafer theory updates as in Chapter II. For instance, the Bayesian update function after a positive observation $w'$ regarding departure time and route would be (as in Equation (2.5)):

$$
f_{updated}(w) =
\begin{cases}
\dfrac{(1 - P_{f-})\, f(w')}{(1 - P_{f-})\, f(w') + P_{f+}\,(1 - f(w'))}, & w = w' \\[2ex]
\dfrac{P_{f+}\, f(w)}{(1 - P_{f-})\, f(w') + P_{f+}\,(1 - f(w'))}, & w \neq w'
\end{cases}
\qquad (4.1)
$$

This allows us to calculate the probability that the target has departed at time $t$ on route $r$, for every combination of departure time and route $w = (t, r)$.
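
As an illustration of the update in Equation (4.1), the Python sketch below applies one positive observation to a uniform prior over departure time and route tuples. The route labels, the error probabilities and the function name are hypothetical values chosen for the example, not parameters from the thesis.

```python
from itertools import product


def update_positive_observation(prior, observed, p_false_neg, p_false_pos):
    """Posterior over tuples w after a positive observation of tuple `observed`."""
    def likelihood(w):
        # A detection is correct with probability 1 - P_f- and spurious with P_f+.
        return (1 - p_false_neg) if w == observed else p_false_pos

    unnorm = {w: likelihood(w) * p for w, p in prior.items()}
    total = sum(unnorm.values())
    return {w: v / total for w, v in unnorm.items()}


# Hypothetical space W: three departure times on two non-intersecting routes.
departure_times = [6, 7, 8]
routes = ["coastal", "offshore"]
space = list(product(departure_times, routes))
prior = {w: 1 / len(space) for w in space}

posterior = update_positive_observation(
    prior, observed=(8, "coastal"), p_false_neg=0.2, p_false_pos=0.1
)
print(round(posterior[(8, "coastal")], 4))
```

Since the route is simply a second coordinate of $w$, the same function covers the single-route model of Chapter II when only one route is listed.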

In order to update messages, we must make assumptions similar to the ones made in Chapter II. We assume here that all the true messages of the same size are equally likely, where "size" refers to the number of departure time and route tuples in the message. We also assume that the probability that a message is true is determined by the size $k$.

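Following Equation (2.14), the update after a message that is true with probability $q_k$ is

$$
f_{update}(w) =
\begin{cases}
\dfrac{\frac{q_k}{k}\, f(w)}{\frac{q_k}{k}\sum_{w' \in M} f(w') + \frac{1 - q_k}{n - k}\sum_{w' \notin M} f(w')}, & w \in M \\[2ex]
\dfrac{\frac{1 - q_k}{n - k}\, f(w)}{\frac{q_k}{k}\sum_{w' \in M} f(w') + \frac{1 - q_k}{n - k}\sum_{w' \notin M} f(w')}, & w \notin M
\end{cases}
\qquad (4.2)
$$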


Applying Dempster-Shafer theory methods to this case is also straightforward. If $p$ is the probability of a true observation regarding departure time and route $w'$, the corresponding mass assignment is:

$$
m(\{w'\}) = p, \qquad m(W - \{w'\}) = 1 - p
\qquad (4.3)
$$

and the mass assignment for a message $M$ that includes $k$ departure times and routes is:

$$
m(M) = q_k, \qquad m(W - M) = 1 - q_k
\qquad (4.4)
$$

Now that we have assigned masses to the subsets of $W$, the combination rules and probability transformations discussed in Chapter II can be applied in exactly the same manner.

Messages  already  apply  to  multiple  values  in  the  basic  model  discussed  in 
Chapter  II,  and  therefore  this  extension  is  not  relevant  for  updating  messages  -  the  update 
mechanism  for  the  messages  is  exactly  as  in  Chapter  II.  In  the  following  sections  we  will 
discuss  only  the  application  of  extensions  to  the  update  process  of  observations. 

B.  MULTIPLE  ROUTES  WITH  INTERSECTIONS 

Intersecting routes do not affect the updating mechanism regarding the messages because the time and the route aspects of the problem decouple. However, if routes can intersect, an observation may apply to more than one route. Let us assume that we receive an observation $O$ regarding an intersection point that may apply to $k$ tuples of routes and departure times. An observation may apply to more than one departure time if, for instance, one of the routes' departure points is farther from the intersection point.

Figure 11 depicts an example of such a scenario. In this example, an observation is made at 9:00 at the intersection of multiple possible routes. This observation can be applied to the tuples (departure time is 6:00, "Northwest-Southeast route"), (departure time is 7:00, "middle route") and (departure time is 6:00, "Northeast-Southwest route").


Figure 11.  Intersecting routes.

Let us define an observation as the subset of the possible tuples that it relates to, $O \subseteq W$. We define the false negative and false positive probabilities of the sensor as in Chapter II; we denote the positive observation, meaning "there is something that corresponds to locations and departure times $O$," as $O^+$ and a negative observation, "the sensor did not recognize anything as belonging to $O$," as $O^-$. The false positive error is defined as $P(O^+ \mid w' \notin O) = P_{f+}$ and the false negative error as $P(O^- \mid w' \in O) = P_{f-}$. It is reasonable to assume that the errors do not depend on the departure time and route, and therefore the updated probability of $w$ following a positive observation $O^+$ is:


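$$
f_{update}(w) =
\begin{cases}
\dfrac{(1 - P_{f-})\, f(w)}{(1 - P_{f-})\sum_{w' \in O} f(w') + \frac{P_{f+}}{n - k}\sum_{w' \notin O} f(w')}, & w \in O \\[2ex]
\dfrac{\frac{P_{f+}}{n - k}\, f(w)}{(1 - P_{f-})\sum_{w' \in O} f(w') + \frac{P_{f+}}{n - k}\sum_{w' \notin O} f(w')}, & w \notin O
\end{cases}
\qquad (4.5)
$$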

The derivation of (4.5) follows the same derivation as Equation (2.5) and is similar to the case of receiving a message that contains multiple departure times as in Equation (2.14), but with one difference: while for messages we had to assume some uniformity among the messages produced, here the fact that the errors are independent is sufficient.

If the vessel can switch routes at the intersection points, as shown in Figure 12, the situation is slightly more complicated:

Figure 12.  Intersecting routes with possibility of switching routes.

In this case we can formulate the problem as consisting of four routes, two of which overlap at the departure and arrival points, and all four intersect in the middle point. Similarly, wherever $r$ routes meet at an intersection point, we can define $r^2$ distinct routes. After this small alteration we can apply the update process defined in (4.5) to solve this case as well.
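
The route count can be made concrete by pairing every leg that enters the intersection with every leg that leaves it. The Python sketch below is illustrative; the leg labels and the function name are hypothetical, and it assumes each of the $r$ original routes contributes exactly one inbound and one outbound leg.

```python
from itertools import product


def composite_routes(inbound_legs, outbound_legs):
    """Every (inbound, outbound) pairing is treated as one distinct route."""
    return list(product(inbound_legs, outbound_legs))


# Two crossing routes, as in the example: two ways in and two ways out.
inbound = ["from northwest", "from northeast"]
outbound = ["to southeast", "to southwest"]

routes = composite_routes(inbound, outbound)
for route in routes:
    print(route)
print("number of distinct routes:", len(routes))
```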

C.  MULTIPLE  TARGETS 

Dealing  with  multiple  targets  is  a  much  more  complicated  topic.  However,  when 
the  total  number  of  targets  in  the  area  of  interest  is  known,  a  similar  scheme  to  the  one 
presented  in  sections  A  and  B  of  this  chapter  can  be  applied.  We  assume  that  there  is  only 
one  possible  route,  as  in  Chapter  II. 

Let us assume that there are $u$ targets, $Z = \{z_1, \dots, z_u\}$. The tuple $w = (t, z)$ specifies that target $z$ departed at time $t$. We redefine our space of interest as $W \subseteq T \times Z$, $|W| = n$, which describes every possible vessel's departure time and identity.

An observation can relate to a specific subset of departure times and vessels. If the sensor can recognize vessels, the observation will include only a single one; if it can classify them into different categories, then the observation may relate to a subset of the vessels.

Assuming that the sensor's false positive and false negative probabilities do not depend on the type of the target, the update function following a positive observation $O^+$ is:

$$
f_{update}(w) =
\begin{cases}
\dfrac{(1 - P_{f-})\, f(w)}{(1 - P_{f-})\sum_{w' \in O} f(w') + \frac{P_{f+}}{n - k}\sum_{w' \notin O} f(w')}, & w \in O \\[2ex]
\dfrac{\frac{P_{f+}}{n - k}\, f(w)}{(1 - P_{f-})\sum_{w' \in O} f(w') + \frac{P_{f+}}{n - k}\sum_{w' \notin O} f(w')}, & w \notin O
\end{cases}
\qquad (4.6)
$$

exactly as in (4.5).

If the sensor characteristics depend on the type of the vessel (which is reasonable, since bigger vessels are easier to detect, and debris is more likely to be misclassified as a small vessel), the situation is more complicated. Now the false positive error is a function of the target, $P(O^+ \mid w \notin O, z) = P_{f+}(z)$, and likewise the false negative error is $P(O^- \mid w \in O, z) = P_{f-}(z)$.


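The update process can be formulated as:

$$
f_{update}(w = (t, z)) =
\begin{cases}
\dfrac{(1 - P_{f-}(z))\, f(w)}{\sum_{w' = (t', z') \in O} (1 - P_{f-}(z'))\, f(w') + \sum_{w' = (t', z') \notin O} \frac{P_{f+}(z')}{n - k}\, f(w')}, & w \in O \\[2ex]
\dfrac{\frac{P_{f+}(z)}{n - k}\, f(w)}{\sum_{w' = (t', z') \in O} (1 - P_{f-}(z'))\, f(w') + \sum_{w' = (t', z') \notin O} \frac{P_{f+}(z')}{n - k}\, f(w')}, & w \notin O
\end{cases}
\qquad (4.7)
$$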


This scheme works nicely for a known number of targets; however, this Bayesian update process is not straightforward to implement in cases where the number of vessels is not known or changes during the scenario.

D.  DIFFERENT VELOCITIES

Let us again consider the case of a single target and a single route but multiple fixed velocities. Let us define a set of possible velocities $V = \{v_1, \dots, v_h\}$ and discuss the probability distribution of $(t, v) = w$ with our space of interest being $W \subseteq T \times V$ and $|W| = n$.


An observation that was taken at time $t$ regarding a certain location, say at a distance $d$ along the route, needs to be translated to the corresponding subset of $W$: all the departure time and velocity combinations that would bring the vessel to the location of the observation at the time it was taken. The subset that corresponds to this observation can be constructed by:

$$
O = \{\, w' = (t', v') \mid (t - t') \cdot v' = d \,\}
\qquad (4.8)
$$


Now  that  we  have  established  the  subset  of  W  that  the  observation  refers  to,  we 
can  continue  with  the  update  process  as  in  the  previous  sections. 
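
As an illustration, the construction of the subset in (4.8) can be sketched as follows. The helper name, the tolerance parameter, and the numeric values (departure times in hours, velocities in knots) are assumptions made for this example only.

```python
from itertools import product


def observation_subset(space, t_obs, distance, tol=1e-9):
    """Return all (departure time, velocity) pairs that place the vessel
    at `distance` along the route at time `t_obs`, as in Equation (4.8)."""
    return {
        (t_dep, v)
        for (t_dep, v) in space
        if abs((t_obs - t_dep) * v - distance) <= tol
    }


# Hypothetical discretization of the space W = T x V.
departure_times = [0, 1, 2, 3]
velocities = [10, 20]
W = set(product(departure_times, velocities))

# An observation at time 4, at a distance of 20 along the route.
O = observation_subset(W, t_obs=4, distance=20)
```

Here two combinations qualify: leaving at time 2 with velocity 10, or leaving at time 3 with velocity 20. With real data the tolerance would need to reflect the resolution of the time and distance grids.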


E.  SUMMARY 

In this chapter we have proposed several extensions to the basic model developed
in Chapter II to accommodate multiple routes (with and without intersections),
multiple targets, and multiple velocities. The probability updating process for both
observations  and  messages  is  very  similar  to  the  processes  discussed  in  Chapters  II  and 
III  and  requires  only  small  changes  to  accommodate  those  extensions.  Therefore  we 
expect  that  the  results  found  in  Chapter  III  extend  to  these  cases  as  well.  Accommodating 
multiple  extensions  simultaneously  requires  more  bookkeeping  and  notation,  but  is 
straightforward  using  the  techniques  described  in  this  chapter. 

Known  correlations  between  the  parameters  of  the  model,  such  as  the  velocity  and 
the  type  of  vessel,  can  be  incorporated  into  the  prior  distribution.  The  prior  can  also 
account  for  intelligence  regarding  the  likelihood  of  the  possible  routes. 
V. ASSESSING THE INFORMANT’S RELIABILITY


In previous chapters we assumed that the reliability $q_k$ of an informant is known.
In reality this will not be known. As the operator receives more information from an
informant, he should update his estimate of the informant’s reliability. In this chapter we
develop a method of estimating and updating the reliability parameter $q_k$, together with
the distribution of the departure time $T_d$. A key factor in updating the reliability is
whether the operator is able to verify the truthfulness of the informant’s previous
messages. In section A we examine the situation where the operator is able to verify the
truth of the message, and in section B we examine the more challenging situation where
the operator does not have this capability.


A.  MESSAGE  TRUTH  CAN  BE  VERIFIED 


We assume that the operator does not know the exact reliability parameter $q_k$, but
has some prior knowledge about its probability distribution. Recall that $q_k$ is the
probability that a message of size $k$ contains the true departure time. Equivalently, we can
interpret it as the long-run fraction of messages that are true. Over time, the
operator observes whether messages of a certain size are true and uses this information to
update the probability distribution of $q_k$. This situation is similar to the problem of
flipping a biased coin with unknown probability of obtaining a Head and updating that
probability over time based on the observed outcomes of the flips. We treat the reliability
parameter as a random variable $Q_k$ that takes values $0 < q_k < 1$. Let us define $X$
as a random variable denoting whether the message received is verified to be true ($X = 1$)
or untrue ($X = 0$). The probability of receiving a true or false message is (by definition):

$$
P(X = x \mid q_k) = q_k^{x} (1 - q_k)^{1 - x} \qquad (5.1)
$$
We can use the expression in Equation (5.1) to formulate an update for the
distribution of $Q_k$ based on the veracity of the most recent message:

$$
f_{Q_k}(q_k \mid X = x) = \frac{P(X = x \mid q_k)\, f_{prior}(q_k)}{P(X = x)}
= \frac{q_k^{x} (1 - q_k)^{1 - x}\, f_{prior}(q_k)}{P(X = x)} \qquad (5.2)
$$

where $P(X = x)$ is the marginal probability of the verification outcome $x$.


We would like the posterior distribution of $Q_k$ to be from the same family as its
prior, so that the calculations are tractable. In Bayesian terminology, a prior with this
property is called a “conjugate prior.” The Beta distribution satisfies this condition for
the case of Bernoulli trials and therefore it is common to use the Beta distribution as the
prior (see Berger, 1993). The Beta distribution has two parameters $\alpha, \beta$, with mean
$E[Q_k] = \frac{\alpha}{\alpha + \beta}$, and probability density function:

$$
f_{Q_k}(q_k) = \frac{q_k^{\alpha - 1} (1 - q_k)^{\beta - 1}}{B(\alpha, \beta)} \qquad (5.3)
$$

where the denominator of (5.3) is the Beta function, which is defined as
$B(x, y) = \int_0^1 t^{x - 1} (1 - t)^{y - 1}\, dt$. By substituting the prior (5.3) into Bayes’
theorem (5.2) we obtain the posterior distribution of $Q_k$:
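$$
f_{Q_k}(q_k \mid x) = \frac{q_k^{x} (1 - q_k)^{1 - x}\, q_k^{\alpha - 1} (1 - q_k)^{\beta - 1}}{B(\alpha, \beta)\, P(x)}
= \frac{q_k^{\alpha + x - 1} (1 - q_k)^{\beta - x}}{B(\alpha, \beta)\, P(x)} \qquad (5.4)
$$

And since $P(x)$ is merely a normalizing term, we obtain:

$$
f_{Q_k}(q_k \mid x) \sim Beta(\alpha + x,\ \beta + 1 - x) \qquad (5.5)
$$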
As desired, the posterior distribution of $Q_k$ is also a Beta distribution but with
different parameters. As seen from Equation (5.5), a true message ($x = 1$) will increase the
$\alpha$ parameter of the distribution by 1, while a false message will increase the $\beta$
parameter by 1.

Let us limit the current discussion to a single informant, providing messages of a
specific size $k$ and reliability $Q_k$. Although $q_k$ is now the value of a random variable, we
can apply the same reasoning as in Chapter II, in conjunction with the law of total
expectation, to generalize the probability of receiving a specific message $M_k$ in Equation
(2.11):

$$
P(M_k \mid T_d = t) =
\begin{cases}
\dfrac{E[Q_k]}{\binom{n - 1}{k - 1}} & t \in M_k \\[2ex]
\dfrac{1 - E[Q_k]}{\binom{n - 1}{k}} & t \notin M_k
\end{cases}
\qquad (5.6)
$$


As in Chapter II, we assume that all true messages of a certain size are equally
likely, and all false ones are also equally likely. When a new piece of intelligence is
received, the departure time distribution can be updated using the same method as in
Chapter II, while replacing $q_k$ with $E[Q_k]$, which equals $\frac{\alpha}{\alpha + \beta}$ in the case of the Beta
distribution. Since the departure time update is the same as in Chapter II, in this section
we focus on the update of $Q_k$.
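
A minimal sketch of this departure time update, assuming the message likelihood of Equation (5.6) with the expected reliability in place of a known one, could look as follows. The function name and data layout are hypothetical; the example values are the four departure times and the Beta(9, 1) reliability used later in this chapter.

```python
from math import comb


def update_departure_times(prior, message, alpha, beta):
    """Update P(T_d = t) after a message, using E[Q_k] = alpha / (alpha + beta).

    prior   -- dict mapping each possible departure time to its probability
    message -- set of departure times named by the informant (size k < n)
    """
    n, k = len(prior), len(message)
    mean_q = alpha / (alpha + beta)
    unnormalized = {}
    for t, p in prior.items():
        if t in message:
            likelihood = mean_q / comb(n - 1, k - 1)   # true message containing t
        else:
            likelihood = (1 - mean_q) / comb(n - 1, k)  # false message avoiding t
        unnormalized[t] = likelihood * p
    total = sum(unnormalized.values())
    return {t: value / total for t, value in unnormalized.items()}


prior = {"t1": 0.25, "t2": 0.25, "t3": 0.25, "t4": 0.25}
posterior = update_departure_times(prior, {"t1", "t2"}, alpha=9, beta=1)
```

With a uniform prior, the two times named in the message each receive probability 0.45 and the two others 0.05.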

Let us discuss the values we should assign to the parameters $\alpha, \beta$ in the prior
distribution. If we have no prior knowledge about the reliability of the informant, a
common uninformative conjugate prior for $q_k$ is $Beta\left(\frac{1}{2}, \frac{1}{2}\right)$. This is known as
Jeffreys’ prior and includes the minimal information regarding the distribution of $Q_k$
among all possible priors (Berger, 1993). However, if we do have a prior estimate of the
reliability we can use it to set the prior. The mean reliability is given by
$\frac{\alpha}{\alpha + \beta} = E[Q_k]$.

The “weight” of the prior can be controlled by the values of $\alpha$ and $\beta$: the larger $\alpha$ and
$\beta$, the less the posterior distribution will change with new evidence. Let us look at two
cases: in the first case $\alpha = \beta = 1$ and in the second case $\alpha = \beta = 10$. In both cases
$E[Q_k] = \frac{1}{2}$; however, after a single true message that has been verified, the estimated
reliability of the informant in the first case would be
$E[Q_k] = \frac{\alpha + 1}{\alpha + 1 + \beta} = \frac{2}{3}$, while in the second case
$E[Q_k] = \frac{\alpha + 1}{\alpha + 1 + \beta} = \frac{11}{21}$.
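
The conjugate update of Equation (5.5) and the effect of the prior “weight” can be checked with a short sketch. The function names are hypothetical; exact fractions are used so that the two cases above are reproduced without rounding.

```python
from fractions import Fraction


def beta_update(alpha, beta, verified_true):
    """Posterior Beta parameters after one verified message (Equation (5.5))."""
    return (alpha + 1, beta) if verified_true else (alpha, beta + 1)


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return Fraction(alpha) / (Fraction(alpha) + Fraction(beta))


weak_prior = beta_update(1, 1, verified_true=True)      # becomes Beta(2, 1)
strong_prior = beta_update(10, 10, verified_true=True)  # becomes Beta(11, 10)
weak_mean = beta_mean(*weak_prior)
strong_mean = beta_mean(*strong_prior)
```

The weak prior moves from 1/2 to 2/3 after one verified true message, while the strong prior only moves to 11/21.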


We assume that the larger the message size $k$, the larger $E[Q_k]$ will be, since there
are more possibilities for the message to contain the true departure time. We would also
expect $E[Q_k] > \frac{k}{n}$ for a “useful” informant that provides correct messages with
probability higher than that of a message generated uniformly at random.

In practice the verification can occur in many ways; for example, later intelligence
may confirm without any doubt the location of the vessel, or the capture and
interrogation of drug smugglers may reveal it. Once the true departure time of the vessel is
determined, the truthfulness of the messages received can be determined as well, and can be
used to update the reliability of the informant, as described in this section. The updated
reliability can then be used to better estimate the departure time of other vessels based on
messages from the same informant.


B.  UNVERIFIED  MESSAGES 

In many situations, the operator cannot verify the informant’s information.
However, we still want to use the new information provided by the informant to update
both the departure time distribution and the informant’s reliability. For instance, if we receive
a message that contradicts everything we know so far, it is more likely that the informant
is incorrect, and therefore his reliability should be updated downwards. Unlike the update
procedure in section A of this chapter, where we update the departure time distribution
and the reliability sequentially, here we update the departure time distribution and the
reliability parameter simultaneously.


1.  The  Bayesian  Update  Process 

The main idea behind the simultaneous update process is to define a joint
probability distribution for the departure time $T_d$ and reliability $Q_k$, denoted by
$f_{T_d, Q_k}(t_d, q_k)$. This is a mixed joint distribution, since the time parameter is a discrete
random variable while the reliability parameter is continuous. As in section A of this
chapter, the random reliability parameter $Q_k$ will take value $q_k$.


The update of the joint distribution of the true departure time $T_d$ and reliability
parameter $Q_k$ when a new message $M_k$ is received can be calculated by applying Bayes’
theorem:

$$
f^{new}_{T_d, Q_k}(t_d, q_k) = f^{prior}_{T_d, Q_k}(t_d, q_k)\, \frac{P(M_k \mid T_d = t_d, Q_k = q_k)}{P(M_k)} \qquad (5.7)
$$

In order to evaluate (5.7) we need to define the prior $f^{prior}_{T_d, Q_k}(t_d, q_k)$ and the update
function $\frac{P(M_k \mid T_d = t_d, Q_k = q_k)}{P(M_k)}$.

The prior $f^{prior}_{T_d, Q_k}(t_d, q_k)$ combines two parts: (1) the departure time, and (2)
the reliability. We assume that these two components are independent for the prior. For
the departure time, the prior can include any information; however, without further
knowledge we assume it to be uniform among the departure times: $T_d \sim U[T]$. For the
reliability part we assume that a prior from the Beta family, as in section A, is appropriate,
and so $Q_k \sim Beta(\alpha, \beta)$. The independence between $T_d$ and $Q_k$ holds only for the prior,
but not for the updated distribution. The final expression of the joint distribution prior is:

$$
f^{prior}_{T_d, Q_k}(t_d, q_k) = \frac{1}{n} \cdot \frac{q_k^{\alpha - 1} (1 - q_k)^{\beta - 1}}{B(\alpha, \beta)} \qquad (5.8)
$$


Next let us derive the update function $\frac{P(M_k \mid T_d = t, Q_k = q_k)}{P(M_k)}$.

First we need to calculate $P(M_k \mid T_d = t, Q_k = q_k)$. As in Chapter II, Equation (2.11), the
probability of receiving a message $M_k$ given the true departure time $T_d$ and reliability
parameter $q_k$ is:

$$
P(M_k \mid T_d = t, Q_k = q_k) =
\begin{cases}
\dfrac{q_k}{\binom{n - 1}{k - 1}} & t \in M_k \\[2ex]
\dfrac{1 - q_k}{\binom{n - 1}{k}} & t \notin M_k
\end{cases}
\qquad (5.9)
$$


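By defining an indicator function
$I(t \in M_k) = \begin{cases} 1 & t \in M_k \\ 0 & t \notin M_k \end{cases}$
we can combine the two cases in (5.9):

$$
P(M_k \mid T_d = t, Q_k = q_k) = \left( \frac{q_k}{\binom{n - 1}{k - 1}} \right)^{I(t \in M_k)} \left( \frac{1 - q_k}{\binom{n - 1}{k}} \right)^{1 - I(t \in M_k)} \qquad (5.10)
$$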


Notice that there are two ways for the probability of receiving a certain message
$M_k$ in (5.10) to be large. Either the informant is considered reliable (high $Q_k$) and the
message includes the true departure time, or the message does not include the true
departure time and the informant is considered unreliable (low $Q_k$).
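The denominator of the update function, $P(M_k)$, is the normalizing term and can
be calculated by summing over $T_d$ and integrating over $Q_k$:

$$
P(M_k) = \sum_{t \in T} \int_0^1 P(M_k \mid T_d = t, Q_k = q_k)\, f^{prior}_{T_d, Q_k}(t, q_k)\, dq_k \qquad (5.11)
$$

which in our case for the prior translates to:

$$
\begin{aligned}
P(M_k) &= \sum_{t \in M_k} \frac{1}{n} \frac{1}{\binom{n - 1}{k - 1}} \int_0^1 \frac{q_k^{\alpha} (1 - q_k)^{\beta - 1}}{B(\alpha, \beta)}\, dq_k
+ \sum_{t \notin M_k} \frac{1}{n} \frac{1}{\binom{n - 1}{k}} \int_0^1 \frac{q_k^{\alpha - 1} (1 - q_k)^{\beta}}{B(\alpha, \beta)}\, dq_k \\
&= \frac{k}{n \binom{n - 1}{k - 1}} \frac{B(\alpha + 1, \beta)}{B(\alpha, \beta)} + \frac{n - k}{n \binom{n - 1}{k}} \frac{B(\alpha, \beta + 1)}{B(\alpha, \beta)}
= \frac{k!\,(n - k)!}{n!} = \frac{1}{\binom{n}{k}}
\end{aligned}
\qquad (5.12)
$$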


This simple expression for $P(M_k)$, the probability that the informant provides
the message $M_k$ before the arrival of any new information, is not surprising given the
uniform prior and uniform generation of messages. There are $\binom{n}{k}$ possible messages
that can be generated by picking $k$ departure times from $n$ possibilities, and so the
probability of picking one of them uniformly is $\frac{1}{\binom{n}{k}}$.
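
The claim that every message is equally likely a priori can be checked with a small Monte Carlo sketch of the message-generation model. Everything here is an assumption made for illustration: the function names, the seed, and the number of trials are hypothetical, and the generator simply follows the description above (a true message contains the true time plus k - 1 other times chosen uniformly; a false message contains k times chosen uniformly from the other n - 1).

```python
import random
from collections import Counter


def generate_message(rng, times, true_time, k, q):
    """Draw one message of size k from an informant with reliability q."""
    others = [t for t in times if t != true_time]
    if rng.random() < q:
        chosen = [true_time] + rng.sample(others, k - 1)  # true message
    else:
        chosen = rng.sample(others, k)                    # false message
    return frozenset(chosen)


def message_frequencies(n, k, alpha, beta, trials, seed=0):
    """Empirical frequency of each message under a uniform departure time
    prior and a Beta(alpha, beta) prior on the reliability."""
    rng = random.Random(seed)
    times = list(range(1, n + 1))
    counts = Counter()
    for _ in range(trials):
        true_time = rng.choice(times)
        q = rng.betavariate(alpha, beta)
        counts[generate_message(rng, times, true_time, k, q)] += 1
    return {message: count / trials for message, count in counts.items()}


freqs = message_frequencies(n=4, k=2, alpha=9, beta=1, trials=60000)
```

For n = 4 and k = 2 there are six possible messages, and each empirical frequency should be close to 1/6 regardless of the Beta parameters.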


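Now that we have both the prior and the update function we can evaluate the new
joint distribution after receiving a message. The update function is:

$$
\frac{P(M_k \mid T_d = t, Q_k = q_k)}{P(M_k)}
= \binom{n}{k} \left( \frac{q_k}{\binom{n - 1}{k - 1}} \right)^{I(t \in M_k)} \left( \frac{1 - q_k}{\binom{n - 1}{k}} \right)^{1 - I(t \in M_k)}
= \left( \frac{n\, q_k}{k} \right)^{I(t \in M_k)} \left( \frac{n\, (1 - q_k)}{n - k} \right)^{1 - I(t \in M_k)}
\qquad (5.13)
$$

Combining the prior Equation (5.8) and the update function (5.13), the updated
joint distribution is:

$$
f^{new}_{T_d, Q_k}(t, q_k) = \frac{q_k^{\alpha - 1} (1 - q_k)^{\beta - 1}}{B(\alpha, \beta)} \left( \frac{q_k}{k} \right)^{I(t \in M_k)} \left( \frac{1 - q_k}{n - k} \right)^{1 - I(t \in M_k)}
\qquad (5.14)
$$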
A convenient way to think about this joint distribution is by envisioning $n$
different reliability parameters $Q_{k,i}$, one for each possible departure time. Each of those
variables has a Beta distribution with parameters $\alpha_i$ and $\beta_i$. To update the joint
probability distribution we increase $\alpha_i$ by 1 for departure times that are included in the
message and increase $\beta_i$ by 1 for departure times that are not included in the message.
This increases the expected $Q_k$ for departure times that are included in messages, and
decreases the expected $Q_k$ of departure times that are not included.
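
The bookkeeping described above can be sketched as follows. This is one possible representation, not the thesis's own implementation: each departure time carries a log-weight and a pair of Beta exponents, so that the joint density is a sum of terms of the form weight times q^(a-1) (1-q)^(b-1). All names are hypothetical, and the constant normalizing term P(M_k) is dropped during updates and restored when marginals are computed.

```python
from math import comb, exp, lgamma, log


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def make_prior(times, alpha, beta):
    """Uniform prior over departure times, Beta(alpha, beta) over reliability.
    Each entry maps t to (log_weight, a, b)."""
    log_w = -log(len(times)) - log_beta(alpha, beta)
    return {t: (log_w, alpha, beta) for t in times}


def update(joint, message):
    """Apply one unverified message of size k < n to the joint distribution."""
    n, k = len(joint), len(message)
    updated = {}
    for t, (log_w, a, b) in joint.items():
        if t in message:   # multiply by q / C(n-1, k-1)
            updated[t] = (log_w - log(comb(n - 1, k - 1)), a + 1, b)
        else:              # multiply by (1 - q) / C(n-1, k)
            updated[t] = (log_w - log(comb(n - 1, k)), a, b + 1)
    return updated


def marginal_time(joint):
    """Marginal P(T_d = t): integrate each term over q, then normalize."""
    mass = {t: exp(log_w + log_beta(a, b)) for t, (log_w, a, b) in joint.items()}
    total = sum(mass.values())
    return {t: m / total for t, m in mass.items()}


def mean_reliability(joint):
    """Mean of the marginal distribution of Q_k."""
    total = sum(exp(log_w + log_beta(a, b)) for log_w, a, b in joint.values())
    first_moment = sum(exp(log_w + log_beta(a + 1, b)) for log_w, a, b in joint.values())
    return first_moment / total


joint = update(make_prior(["t1", "t2", "t3", "t4"], 9, 1), {"t1", "t2"})
after_one = (marginal_time(joint), mean_reliability(joint))
joint = update(joint, {"t1", "t2"})
after_two = (marginal_time(joint), mean_reliability(joint))
```

With four departure times and a Beta(9, 1) prior, one message leaves the mean reliability at 0.9, while a second identical message raises it to 21/23.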

2.  The  Marginal  Distributions  after  a  Single  Message 

In  order  to  gain  more  insight  on  the  influence  of  a  message  on  the  distribution  of 
T  and  Q  ,  we  calculate  the  marginal  distributions  after  a  single  message. 


a. The Marginal Distribution of $T_d$

The general expression for calculating the marginal distribution of $T_d$ from the joint
distribution is $P(T_d = t) = \int_0^1 f_{T_d, Q_k}(t, q_k)\, dq_k$. Applying it to the joint distribution
defined in (5.14) yields:

$$
P(T_d = t) = \left( \frac{1}{k} \frac{B(\alpha + 1, \beta)}{B(\alpha, \beta)} \right)^{I(t \in M_k)} \left( \frac{1}{n - k} \frac{B(\alpha, \beta + 1)}{B(\alpha, \beta)} \right)^{1 - I(t \in M_k)}
\qquad (5.15)
$$


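But we can simplify this expression since the Beta function has the property
$\frac{B(\alpha + 1, \beta)}{B(\alpha, \beta)} = \frac{\alpha}{\alpha + \beta}$, so the marginal distribution of the departure time is simply:

$$
P(T_d = t) =
\begin{cases}
\dfrac{1}{k} \dfrac{\alpha}{\alpha + \beta} & t \in M_k \\[2ex]
\dfrac{1}{n - k} \dfrac{\beta}{\alpha + \beta} & t \notin M_k
\end{cases}
\qquad (5.16)
$$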


Just like in Chapter II, Equation (2.15), the probability of a certain
departure time given a message $M_k$ depends on the size of the message $k$ and the
expected value of the reliability of the message, $\frac{\alpha}{\alpha + \beta} = E[Q_k]$. In order for the
marginal probability of the times that are in the message to increase, we require that
$\frac{1}{k} \frac{\alpha}{\alpha + \beta} > \frac{1}{n - k} \frac{\beta}{\alpha + \beta}$, or $\frac{\alpha}{\alpha + \beta} > \frac{k}{n}$. We can interpret this condition as stating that the
informant must be “useful,” that is, one that has a higher probability to generate a true
message than if one were to pick a message uniformly at random. The probability that the
message contains the true value is equal to the expected reliability:

$$
P(T_d \in M_k) = k \cdot \frac{1}{k} \frac{\alpha}{\alpha + \beta} = \frac{\alpha}{\alpha + \beta} = E[Q_k] \qquad (5.17)
$$

which is exactly the desired property of the reliability notion.
b. The Marginal Distribution of $Q_k$

The general formula for the marginal distribution over a discrete random
variable is $f_{Q_k}(q_k) = \sum_{t \in T} f_{Q_k, T_d}(q_k, t)$, which in our case translates to:

$$
f_{Q_k}(q_k) = \frac{q_k^{\alpha - 1} (1 - q_k)^{\beta - 1}}{B(\alpha, \beta)} \left( k \cdot \frac{q_k}{k} + (n - k) \cdot \frac{1 - q_k}{n - k} \right)
= \frac{q_k^{\alpha - 1} (1 - q_k)^{\beta - 1}}{B(\alpha, \beta)}
\qquad (5.18)
$$


The marginal distribution of $Q_k$ after a single message remains the same.
This result is intuitive. Because the prior distribution of the departure time is uniform, we
cannot utilize that distribution to gain information about whether the first message is true
or false (in other words, all the possible messages are equally likely to be received), and
therefore the estimated reliability remains unchanged. We cannot draw any information
about the informant from his message because we did not have any specific information
about the departure times. If the prior distribution for the departure time were not uniform,
the marginal distribution of $Q_k$ would change.


The mean of the reliability is the same as before this message:

$$
E[Q_k] = \int_0^1 q_k\, f_{Q_k}(q_k)\, dq_k = \frac{B(\alpha + 1, \beta)}{B(\alpha, \beta)} = \frac{\alpha}{\alpha + \beta} \qquad (5.19)
$$


In order to make this model clearer, let us look at a numeric example,
following the example in Chapter II, section D.4. We will assume that there are only four
possible departure times, $T = \{t_1, t_2, t_3, t_4\}$, of which the true departure time is $t_1$. We
receive a message of size $k = 2$, $M_2 = \{t_1, t_2\}$, and the reliability of our informant for messages
of size 2 is $Q_2 \sim Beta(\alpha = 9, \beta = 1)$ (and $E[Q_2] = 0.9$). The prior probability distribution
is assumed to be uniform: $f_{prior}(t_1) = f_{prior}(t_2) = f_{prior}(t_3) = f_{prior}(t_4) = 0.25$.


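The joint distribution, as developed in Equation (5.14), in this example is
calculated to be:

$$
f_{T_d, Q_2}(t, q_2) =
\begin{cases}
\dfrac{q_2^{9}}{2\, B(9, 1)} & t \in \{t_1, t_2\} \\[2ex]
\dfrac{q_2^{8} (1 - q_2)}{2\, B(9, 1)} & t \in \{t_3, t_4\}
\end{cases}
\qquad (5.20)
$$

The updated marginal distribution will be:

$$
P(T_d = t_1) = P(T_d = t_2) = \frac{1}{2} \cdot 0.9 = 0.45, \qquad
P(T_d = t_3) = P(T_d = t_4) = \frac{1}{2} \cdot 0.1 = 0.05
\qquad (5.21)
$$

Which is in full agreement with the results in (2.16).
The marginal distribution of $Q_2$ is now:

$$
f_{Q_2}(q_2) = \frac{q_2^{9}}{B(9, 1)} + \frac{q_2^{8} (1 - q_2)}{B(9, 1)} = \frac{q_2^{8}}{B(9, 1)} \qquad (5.22)
$$

which means that $Q_2 \sim Beta(9, 1)$, as before the message.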


In  the  following  section,  we  will  discuss  an  example  where  the  marginal 
distribution  of  Q  does  change  after  the  update. 


3.  The  Joint  Distribution  after  Two  Identical  Messages 

We will now examine how the joint distribution evolves after two identical
messages. By applying the update process described in (5.7) twice, with the new prior
being the resulting posterior distribution after the first update (5.14), we obtain the
expression for the distribution after two identical messages $M_k$, with $C$ as the
normalizing constant:

$$
f_{T_d, Q_k}(t, q_k) = C\, \frac{q_k^{\alpha - 1} (1 - q_k)^{\beta - 1}}{B(\alpha, \beta)} \left( \frac{q_k}{k} \right)^{2 I(t \in M_k)} \left( \frac{1 - q_k}{n - k} \right)^{2 (1 - I(t \in M_k))}
$$

By absorbing all the constants into the normalizing parameter, which we continue to
denote by $C$, we obtain:

$$
f_{T_d, Q_k}(t, q_k) = C\, q_k^{\alpha - 1} (1 - q_k)^{\beta - 1} \left( \frac{q_k}{k} \right)^{2 I(t \in M_k)} \left( \frac{1 - q_k}{n - k} \right)^{2 (1 - I(t \in M_k))}
\qquad (5.24)
$$


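We calculate the normalizing term $C$ by requiring the joint density to integrate to 1:

$$
1 = \sum_{t \in T} \int_0^1 f_{T_d, Q_k}(t, q_k)\, dq_k = C \left( \frac{B(\alpha + 2, \beta)}{k} + \frac{B(\alpha, \beta + 2)}{n - k} \right)
$$

And now we can calculate the updated distribution in (5.24). Unlike in section 2.b
of this chapter, the marginal distribution of $Q_k$ now changes after the second message.
We compute $f_{Q_k}(q_k) = \sum_{t \in T} f_{Q_k, T_d}(q_k, t)$ using the new distribution after two messages
obtained in (5.24):

$$
f_{Q_k}(q_k) = C \left( \frac{q_k^{\alpha + 1} (1 - q_k)^{\beta - 1}}{k} + \frac{q_k^{\alpha - 1} (1 - q_k)^{\beta + 1}}{n - k} \right)
\qquad (5.26)
$$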


We can see that the marginal distribution of $Q_k$, which was a single Beta distribution
before the update, became a mixture of two Beta distributions after the update. The mean
of $Q_k$ after two messages is:

$$
E[Q_k] = \frac{\dfrac{B(\alpha + 3, \beta)}{k} + \dfrac{B(\alpha + 1, \beta + 2)}{n - k}}{\dfrac{B(\alpha + 2, \beta)}{k} + \dfrac{B(\alpha, \beta + 2)}{n - k}}
$$

This mean exceeds the prior mean $\frac{\alpha}{\alpha + \beta}$ when $\frac{\alpha + 1}{\alpha + \beta + 2} > \frac{k}{n}$, which is reminiscent of
the requirement $\frac{\alpha}{\alpha + \beta} > \frac{k}{n}$ for the informant to be
considered useful from section 2.a. Note that in cases when the informant is not
“useful” (meaning his mean reliability is lower than what it would be if he had picked
departure times randomly), the updated $E[Q_k]$ might decrease. This occurs because if
we believe the informant is very unreliable then we believe that the departure times that
he stated in the message are less likely, and in this light, we downgrade his reliability
even further.
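
The behaviour of the mean after two identical messages can be checked numerically. The sketch below is an assumption-laden illustration: the function names are hypothetical, the closed form is obtained by integrating the two-message joint density term by term, and the threshold test (alpha + 1) / (alpha + beta + 2) > k / n is a condition worked out for this sketch from that closed form.

```python
from math import exp, lgamma


def beta_fn(a, b):
    """Beta function B(a, b), computed through log-gamma for stability."""
    return exp(lgamma(a) + lgamma(b) - lgamma(a + b))


def mean_after_two_identical(n, k, alpha, beta):
    """Mean of Q_k after two identical unverified messages of size k,
    starting from a uniform departure time prior and Beta(alpha, beta)."""
    weight_in = beta_fn(alpha + 2, beta) / k          # times named in the message
    weight_out = beta_fn(alpha, beta + 2) / (n - k)   # times not named
    first_moment = (beta_fn(alpha + 3, beta) / k
                    + beta_fn(alpha + 1, beta + 2) / (n - k))
    return first_moment / (weight_in + weight_out)


def mean_should_increase(n, k, alpha, beta):
    """Condition under which the second identical message raises E[Q_k]."""
    return (alpha + 1) / (alpha + beta + 2) > k / n


useful = mean_after_two_identical(n=2, k=1, alpha=3, beta=2)      # prior mean 0.6
not_useful = mean_after_two_identical(n=2, k=1, alpha=2, beta=3)  # prior mean 0.4
```

For the useful informant the mean rises from 0.6 to 13/21, while for the informant who is not useful it drops from 0.4 to 8/21.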
4. Visualization of the Update Process

We develop a small simulation that visualizes the joint distribution update
process. In the following example $n = 9$, $k = 3$ and the initial parameter values of $Q_k$ are
$\alpha = 4, \beta = 1$, leading to $E[Q_k] = 0.8$.

The  joint  distribution  before  any  messages  arrive  is  depicted  in  Figure  13: 


Figure  13.  Joint  prior  distribution. 


As we can see, the variables are independent. The departure time is uniform, and
the higher values of $Q_k$ are more likely. Now let us examine what happens after
receiving a single message $M_k = \{T_4, T_5, T_9\}$. The joint distribution changes to:

Figure 14. Joint distribution after a single message (panel title: Joint distribution of correctness and departure time).


This implies that either the true departure time is one of $\{T_4, T_5, T_9\}$ and the
reliability of the informant is rather high, or the true departure time is not one of the
above, and the informant’s reliability is low. If we receive another message
$M_k$ that again includes $T_9$, the distribution update can be seen in Figure 15:

Figure 15. Joint distribution after two messages.

It is very likely that the true departure time is $T_9$, with a very reliable informant.
However, there is still some probability that the true departure time is not $T_9$ and the
reliability is lower.

5. Change of $T_d$ and $Q_k$ with the Number of Messages

It is also interesting to examine the distribution of $T_d$ and the mean of $Q_k$ after
multiple messages. In the following example, $n = 2$, $k = 1$ and the initial parameter values
of $Q_k$ are $\alpha = 3, \beta = 2$, leading to $E[Q_k] = 0.6$.

Let us assume we receive 10 identical messages: $M_k = \{T_1\}$. The distribution of
$T_d$ as a function of the number of messages is plotted in Figure 16:

Figure 16. Marginal distribution of $T_d$.


As expected, the probability of $T_1$ increases with the number of messages, while
the probability of $T_2$ decreases.

Figure 17 shows how the mean of $Q_k$ changes with the number of messages
received:

Figure 17. Mean of $Q_k$.

As more messages arrive, the mean of $Q_k$ increases. The repeated messages
confirm both the departure time (as shown in Figure 16) and the informant’s reliability.
Note that the first message does not change the mean of $Q_k$.

Now to a slightly more interesting example: let us assume that $n = 2$, $k = 1$ and the
initial parameter values of $Q_k$ are $\alpha = 2, \beta = 3$, leading to $E[Q_k] = 0.4$. The mean of $Q_k$ now
is lower than the uniform probability of picking $T_1$ at random, and therefore the
informant is more likely to provide incorrect information. Again, we assume we receive
10 identical messages: $M_k = \{T_1\}$.

The marginal distributions in this case are:

Figure 18. Marginal distribution of $T_d$.

Because the initial reliability is so low, we believe the informant is misleading us,
and thus the probability of $T_1$ decreases and the probability of $T_2$ increases.

Figure 19. Mean of $Q_k$.

The mean of $Q_k$ decreases with the number of messages. This result is also
quite intuitive: since the probability of $T_1$ is decreasing, the informant’s repeated messages
that include $T_1$ are considered less likely to be true, and the estimated reliability of the
informant is downgraded.
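
The two experiments in this section can be reproduced in outline with the sketch below. It assumes a uniform prior on the departure time and repeated identical messages, and evaluates the resulting marginals in closed form with Beta functions; the function names are hypothetical and the sketch is not the simulation used to produce Figures 16 through 19.

```python
from math import comb, exp, lgamma


def beta_fn(a, b):
    """Beta function B(a, b), computed through log-gamma."""
    return exp(lgamma(a) + lgamma(b) - lgamma(a + b))


def after_identical_messages(n, k, alpha, beta, m):
    """Return (P(T_d = t) for a time t named in the message, E[Q_k])
    after m identical unverified messages of size k < n."""
    c_in, c_out = comb(n - 1, k - 1), comb(n - 1, k)
    mass_in = beta_fn(alpha + m, beta) / c_in ** m      # each named time
    mass_out = beta_fn(alpha, beta + m) / c_out ** m    # each other time
    total = k * mass_in + (n - k) * mass_out
    first_moment = (k * beta_fn(alpha + m + 1, beta) / c_in ** m
                    + (n - k) * beta_fn(alpha + 1, beta + m) / c_out ** m)
    return mass_in / total, first_moment / total


# Messages M_k = {T_1} repeated 0..10 times, for the two priors discussed above.
reliable = [after_identical_messages(2, 1, 3, 2, m) for m in range(11)]
unreliable = [after_identical_messages(2, 1, 2, 3, m) for m in range(11)]
```

Under these assumptions, the probability of the named time climbs from 0.5 to 6/7 for the Beta(3, 2) informant and falls to 1/7 for the Beta(2, 3) informant, and in both cases the first message leaves the mean reliability unchanged.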

C.  DISCUSSION 

We have proposed a scheme for simultaneous update of the reliability and the
estimation of the departure time. This scheme includes a mixed joint distribution of a
discrete ($T_d$) and a continuous ($Q_k$) random variable that updates in a Bayesian fashion.
As we saw in Chapter III, even if the assumptions are not fulfilled, the performance of the
Bayesian method is satisfactory.

The scheme proposed makes use of conjugate Beta distributions, ensuring simple
calculations that are easy to implement while maintaining flexibility to specify the mean
reliability of an informant and the “strength” of our estimation regarding this reliability.
Although the reliability is unknown, and the true departure time is also unknown,
we can still estimate both of them and improve our estimate as we receive more
intelligence. The scheme makes full use of the information provided by the informant to
efficiently update the informant’s reliability and the vessel’s location simultaneously.




VI.  SUMMARY,  CONCLUSIONS,  AND  FUTURE  WORK 


A.  DATA  FUSION 

In this thesis we formulate a model to assist the Joint Interagency Task Force
South in its efforts to fight drug traffickers originating from South America. The main
problem addressed in this thesis is how to combine different sources of intelligence into a
coherent picture to effectively estimate the location of drug smugglers. In the initial
model, we focus on determining the departure time of a smuggler. In later chapters we
develop methods to estimate the route the smuggler travels, the vessel type, and velocity.
The main contribution of this thesis is developing models to fuse information from two
different types of intelligence sources, namely sensor-based sources and human-based
sources, into a coherent intelligence picture. We update this picture as new information
arrives.

The  main  model  we  explore  is  the  Bayesian  model,  which  is  quite  intuitive, 
mathematically  rigorous  and  elegant.  However,  this  method  requires  assumptions 
regarding  the  underlying  probability  distributions  related  to  the  intelligence  gathered. 
Those  assumptions  are  usually  difficult  to  justify  in  practice  since  their  validation 
requires  gathering  large  amounts  of  data. 

We  compare  the  Bayesian  model  to  a  different  type  of  intelligence  updating 
mechanism,  the  Dempster-Shafer  method.  We  examine  several  ways  to  implement 
Dempster-Shafer  theory  and  compare  those  methods  to  Bayes’  theory  both  qualitatively 
and  quantitatively.  The  quantitative  comparison  is  done  using  a  simulation  across 
multiple  possible  values  of  an  informant’s  reliability  and  ways  in  which  the  messages  are 
created. 

We  found  that  even  when  the  assumptions  of  the  Bayes’  update  process  are 
violated,  it  still  manages  to  yield  the  best  results  in  the  scenarios  examined.  It  specifies 
the  correct  departure  time  a  larger  fraction  of  the  time  than  the  other  methods.  All  the 
updating  methods  perform  poorly  when  the  reliability  of  the  informant  is  low  or  is 
mistaken  to  be  low,  and  there  is  non-uniformity  in  the  way  he  produces  messages. 
B. UPDATING THE INFORMANT’S RELIABILITY

A major contribution of this thesis is a Bayesian model that allows the operator to
assess the reliability of the informant and update the vessel’s location simultaneously. We
can do this both when the informant’s messages can be verified and when they cannot. The
informant’s reliability-departure time joint distribution model, described in detail in
Chapter V, allows estimating both the location of the vessel and the reliability of the
informant together and updating the estimate as more intelligence is received, even
though neither the true departure time of the vessel nor the reliability of the informant is
known initially.

C.  FUTURE  WORK 

This  thesis  suggests  multiple  models  for  updating  the  operator’s  perception  as  he 
receives  more  intelligence  and  sets  a  framework  for  the  comparison  and  evaluation  of 
those  data  fusion  models.  However,  the  research  on  the  models  developed  can  be 
extended  in  the  following  ways: 

1.  Extending  the  Model 

In  Chapter  IV,  multiple  extensions  to  the  basic  model  were  suggested,  but  in  order 
to  encompass  real-life  situations,  one  may  extend  the  model  even  further. 

Possible  extensions  of  interest  are: 

•  Accounting  for  variable  velocities  of  the  vessels. 

•  Accounting  for  the  case  where  the  number  of  vessels  in  the  theater  of 
operations  changes  over  time. 

•  Evaluating the probability $l_k$ that the informant delivers a message of a
certain size. In our analysis we assumed that this probability is known,
but in fact it can be evaluated as more information arrives, in a similar
fashion to the one used in Chapter V to estimate the reliability $q_k$.
2. More Extensive Comparison

Although we have shown that the Bayes’ model performs best in the scenarios
examined, the intelligence community may still benefit from a more exhaustive
comparison between the models.

A worthwhile direction to improve the comparison conducted in this thesis is to
examine the models with other streams of messages, created in different ways than
described in Chapter III. Examining the probability of specifying the correct departure
time after receiving messages of different sizes from multiple informants with different
reliabilities may also be of interest.

Lastly, the computational complexity of the update methods could be compared directly
by measuring the time required to perform the computations of each update method.

3.  Real  Data 

Feeding the models with real data may greatly increase the insights we can
gain from the models and allow us to compare them more effectively. Such real data may
relate to 1) prior knowledge about the vessels’ departure times, velocities and routes, 2)
the characteristics of sensors, namely the false positive error $P_{f^+}$ and the false negative
error $P_{f^-}$, and 3) the characteristics of the informants, such as their reliability and, most
interestingly, the way in which they produce their messages and what types of mistakes
they tend to make.